ABSTRACT


A Quantitative Analysis of Estimating
Accuracy in Software Development
(August  1976) 

Philip Francis Gehring, Jr.

B.S.,  U.S.  Naval  Academy 
M.S.,  Georgia  Institute  of  Technology 
Chairman of Advisory Committee: Dr. Udo W. Pooch

This  research  quantitatively  examines  the  estimating  accuracy  of 
over  5000  standardized  resource  consuming  activities  from  39  soft- 
ware development  projects  of  various  size, which  were  accomplished  at 
the U.S. Air Force Data Systems Design Center. The activity data
pertaining  to  planned  hour  estimates  and  actual  expenditures  were 
collected  by  an  automated  project  management  system  (PARMIS)  as 
the  data  were  generated. 

The dissertation hypothesizes that specific activities can be
isolated which consistently have a greater influence on whether a
software development project will be successful in terms of cost and
schedule estimates. The arithmetic and percent differences between
estimated and observed hour expenditures are the elementary variables
used to investigate estimating accuracy. Various summarizing and
statistical techniques are employed to reveal the information
inherent in the data, and to identify, if possible, a correlation
between the selected activities and the final difference between the
total hours estimated and expended for the project. The findings
from the data source used clearly support the hypothesis. However,
no correlation was found between the activities which have the most
influence on estimating accuracy in a software development project
and other criteria such as the total project difference. The
primary conclusion of this work is that software estimation is still
very poor and inconsistent because the existing model for software
development and traditional estimating techniques are incompatible.

A new  development  model  is  described  in  addition  to  a recommendation 
for  a centralized,  standardized,  software  project  management  system 
which  would  service  several  different  agencies  developing  different 
types of projects.

CHAPTER I

INTRODUCTION 


The  Problem 

It is generally accepted in both the professional and academic
worlds that, at present, it is almost impossible to accurately
estimate the time and cost of the development of the software for new
automated data systems [86]. Very few, if any, completed major
software development projects have met four recognized criteria for
success:

1.  Preliminary  cost  estimates  must  be  accurate  within 
reasonable  bounds. 

2.  Production  schedules  must  be  met. 

3.  The  operational  requirements  and  design  performance 
criteria  must  be  achieved. 

4.  The  product  must  be  reasonably  error  free  and  funda- 
mentally dependable  when  installed  in  an  operational 
environment  [105,  136]. 

The  objective  of  producing  reliable  software  is  certainly  not  at  odds 
with  the  need  to  reduce  the  high  cost  of  software  development  [53]. 

Previous  research  into  the  problem  of  accurately  forecasting 
software  development  has  suffered  from  overly  ambitious  objectives. 
This  has  often  been  the  fault  of  the  research  sponsors  who  want  to  be 
able  to  demonstrate  a solution  to  a problem  justifying  the  expended 
research money. For example, in the middle 1960's the U.S. Air Force
spent hundreds of thousands of dollars to solve the problem of
accurately estimating software development. What they received for
their money was a set of equations designed to assist in estimating
span days, man hours, etc. for programming and testing computer
programs [82]. The research agency promptly disclaimed the accuracy
of these equations for various reasons. The foremost reason was
inadequate data which had to be collected by questionnaire after the
development projects had been completed. Attempting to leap from the
existing meager knowledge of how to accurately estimate the cost of
the software development process, which is complex to say the least,
and little understood by management and technician alike, to an
equation which will accomplish the task, is overly optimistic and not
very scientific.

What is required is basic research, accomplished in small steps,
which  will  reveal  some  of  the  underlying  idiosyncrasies  about  the 
software  development  process.  As  with  all  research,  each  discovery 
made  will  provide  more  insight  into  the  development  process  and  the 
concomitant  estimation  problem  and  create  a broader  foundation  of 
knowledge for further study. The necessary data are beginning to
be collected in management information systems used to control
software development projects [30]. These systems themselves and the
data contained therein need to be scrutinized and studied. Do they
contain the proper kind of data for research? What do the data
themselves reveal? Under what kind of organizational structure and
management environment were the data collected? How disciplined was
the project planning and data collection and reporting? What
assumptions does the system rely on? Software engineering simply must
know more about itself before that discipline shall be provided some
gestalt tool (in terms of handy equations) which will resolve its
estimating and cost dilemmas.

Alfred Pietrasanta has concluded that:

Many  of  the  problems  of  resource  estimating  are 
symptoms  of  an  underlying  ignorance  of  the  process  of 
program  system  development  for  which  the  estimates  are 
being  made.  The  serious  students  of  estimating  must  first 
be  willing  to  probe  deeply  into  the  fascinating  and  complex 
system  development  process;  to  uncover  the  phases  and 
functions of the process; to highlight the subtle
interrelationships of the program system being developed and
the project organization doing the developing.

To the fainthearted, such a study can be self-defeating,
because it will uncover dozens of factors that
influence estimates and will lead to the conclusion that
the process is overwhelmed by unpredictable variability.
Estimating can then be considered an exercise in futility.

Those who persevere, however, will recognize that
examining the influencing variables and their causal
relationships is precisely what is required if estimates
are ever to be improved. Only then can we do meaningful
quantitative research and scientific analysis of resource
requirements. We are never likely to eliminate unpredictable
variability, but we should be able to go a long way
toward improving predictability far above today's
primitive state-of-the-art. [116]

Wolverton sagaciously observes that the software industry is
young, growing and marked by rapid change in technology and
application. It is not surprising, then, that the ability to estimate
costs is still relatively underdeveloped [158]. As true as this
statement may be, there exists an extensive bibliography on the
subject matter, as demonstrated by the 1, “jCK) entries at MITRE
Corporation [30]. There does not seem to be any problem with admitting
that there is a problem. At the 1968 NATO Software Engineering
Conference, Professor Edsger Dijkstra observed that the general
admission of the existence of software failure, in that group of
responsible and renowned people, was the most refreshing experience
he had had in years; the admission of shortcoming being the primary
condition for improvement [106]. The admission notwithstanding,
improvement in estimating software development has not been very
rapid or very significant.

Some of the conditions contributing to the limited understanding of
software development and the inability to predict the time and
resources required in this development have been listed by Bratman:

1.  Lack  of  discipline  and  repeatability. 

2. Lack of development visibility.

3.  Changing  performance  requirements. 

4.  Lack  of  design  verification  tools. 

5. Lack of software reusability. [20]

It is also interesting to note that there is a pessimistic
forecast regarding improvements in software production into the
1980's in even the more recent studies and literature [78, 30]. In
fact, progress in computing in the future will probably be severely
strained by the need for more accurate and dependable methods for
estimating and controlling development costs and schedules.

Software and the Development Process

This research will examine the problem of accurately estimating
the time of some of the activities accomplished in the development
of new software. In the context of this work, software will mean
any computer program or module thereof. A program is a set of
transformations and other relationships over sets of data and
container structures (records, words, sections, etc.) [106]. Software
is used in its broadest sense, and may include operating system
programs, utility programs or application programs. A software
development project may encompass the design, coding, testing, etc.
of a single program or a series of interrelated programs. A software
system implies a group of interrelated programs. New software
development is generally a standardized process through which
software evolves from an idea to a useful system operating on a
computer.

The  traditional  model  for  a software  development  project  includes 
the  following  phases  which  may  have  to  take  place,  to  one  degree 
or  another: 

1.  Feasibility  study, 

2.  Requirements  analysis, 

3.  System  design, 

4. Program design,

5. Coding,

6.  Testing, 

7.  Documentation,  and 

8.  Implementation. 

Five of the above development phases are used in this research: (1)
Feasibility study, (2) Requirements analysis, (3) Program design, (4)
Test and (5) Documentation. Investigation was limited to these
phases because they are the most commonly used and standardized
development phases available from the data source. Standard
definitions for the development phases and the resource consuming
activities accomplished within the phases are contained in
Appendices 1 and 2. These definitions have been extracted from the
user manual for the software development project control system at
the U.S. Air Force Data Systems Design Center [148].

Schwartz  [129]  reduces  the  above  list  of  eight  phases  to  the 
following  four  major  phases: 

1.  Requirements  specification. 

2.  System  design. 

3.  Programming  (coding),  and 

4.  Checkout. 

Within  each  of  the  listed  phases  a multitude  of  resource  consuming 
activities  can  be  defined.  These  activities  represent  work  to  be 
done,  and  project  progress  as  the  activities  are  completed. 

System development has become a structured professional
discipline. Predefined activities have been established, tested and
altered. These form a basis for an orderly, continuing process under
which system development is carried out by an interdisciplinary
project team. These teams include participation by users, system
analysts, programmers, computer operations personnel, operating
management, and others [131]. Teichroew has even experimented with
automating this system building process [145]. Although relatively
well defined, the software development model is rarely
straightforward. The process frequently involves numerous iterations
among the phases of development and the activities within the phases.
These iterations are a result of the knowledge gained as the system
is being generated.

There is general agreement regarding the necessary steps in any
software development effort, whether large or small, regardless of
whose list is used. But, Schwartz points out, understanding and
agreement regarding the process almost seem to stop there. The only
other area of agreement that he identifies is that the development
of software is a terrific problem, little understood and fouled up
with terrifying frequency [129].


Research  Objective 


The objective of this research is to quantitatively analyze
software development resource consuming activities. The most
elementary variable will be the difference (both arithmetic and
percent difference) between the hours estimated to complete a
standard activity and the hours actually expended. Thousands of these
standard activities have been estimated and accomplished across
hundreds of software development projects at the U.S. Air Force Data
Systems Design Center. The project control system used by that
government agency collected these data (i.e., planned estimates and
actual expenditures) as they were generated, in contrast to
information collected by questionnaire after a project is completed.
Using various statistical techniques, the research will deal with
only what these limited data can reveal.
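
The elementary variable described above can be illustrated with a
minimal sketch. The sign convention (actual minus estimated hours) and
the use of the estimate as the base of the percentage are assumptions
made here for illustration only; the text at this point does not state
either. The record layout, activity names and hour values are
hypothetical and are not taken from the data source.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """One resource consuming activity (hypothetical record layout)."""
    name: str
    estimated_hours: float
    actual_hours: float


def arithmetic_difference(act: Activity) -> float:
    # Assumed convention: a positive value means the estimate was overrun.
    return act.actual_hours - act.estimated_hours


def percent_difference(act: Activity) -> float:
    # Assumed base: the original estimate. Undefined for a zero estimate.
    if act.estimated_hours == 0:
        raise ValueError("percent difference is undefined for a zero estimate")
    return 100.0 * arithmetic_difference(act) / act.estimated_hours


# Hypothetical values for illustration only.
activities = [
    Activity("program design", 40.0, 50.0),
    Activity("test", 80.0, 60.0),
]

for act in activities:
    print(act.name, arithmetic_difference(act), percent_difference(act))
```

Reporting both values matters because a ten hour overrun on a forty
hour activity and a ten hour overrun on a four hundred hour activity
have the same arithmetic difference but very different percent
differences.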

In  terms  of  a general  hypothesis  this  research  effort  can  be 
described  as  follows: 

Hypothesis:  That  specific  resource  consuming  activities 
can  be  identified  which  consistently  have 
the  most  significant  influence  on  inaccurate 
software  development  project  estimates. 

Other questions which may be addressed include:

1.  How  was  the  development  effort  distributed  by  phase — 
as  compared  to  the  development  phase  estimates? 

2.  Which  activities  and  phases  consume  the  majority  of 
resources? 

3.  Which  activities  and  phases  have  the  largest  variance 
between  the  estimated  and  actual  time  consumed? 

4.  Does  the  fluctuation  between  estimates  and  expendi- 
tures of  specific  activities  consistently  have  some 
correlation  to  the  known,  final  project  estimate  and 
actual  expenditure  totals? 
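
Question 4 can be made concrete with a small sketch. It assumes the
Pearson product-moment coefficient as the measure of correlation,
which is one common choice; the text does not name the statistic at
this point. The function name and all figures are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-project figures: the hour difference for one selected
# activity, and the total hour difference for the whole project.
activity_diff = [10.0, -5.0, 20.0, 0.0]
project_diff = [120.0, -40.0, 260.0, 30.0]

r = pearson_r(activity_diff, project_diff)
print(f"r = {r:.3f}")
```

A coefficient near +1 or -1 for a given activity across many projects
would suggest that its estimating error tracks the project total; a
value near zero would suggest no linear relationship.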
Fleischer [54] believes that programming is becoming the dominant
cost factor. Management is thus being pressed to closely study
the factors that influence cost and productivity of software
development projects. The insights to be gained, according to
Fleischer, will save money, time and perhaps even the projects
themselves.

Research  Constraints 

The phase of software implementation and the subsequent software
maintenance and modification efforts are not considered in this
dissertation. Implementation and maintenance, however, are
certainly not insignificant resource consuming efforts. For example,
the International Business Machines Corporation (IBM), Systems
Development Division, found that, after two years in the field, 76%
of the total cost for a particular software release was spent on
maintenance as opposed to the 24% spent on development [26]. A
Department of Defense study showed that software for airborne
computers cost up to $4000 per instruction over the lifetime of the
system [141]. However, research associated with estimating the cost
of maintenance is closely related to the problem of software
reliability and is not within the purview of this dissertation.
Furthermore, the implementation phase has also been excluded because
there is some question regarding the reliability of the data in the
data source used for this research. Sherman acknowledged the
importance of system implementation and maintenance as follows: "The
completion of system testing and the release of the programs for live
running is, to paraphrase a famous speech, 'Not the end of the end,
but the end of the beginning'" [134].

This is not another study to evaluate programmer productivity.
It is suggested that it is the productivity of analysts and managers
which should be examined, if in fact productivity per se is a
subject worthy of the amount of investigation given to it. The
variability of definition and results regarding productivity is
addressed more thoroughly in Chapter III, Literature Survey. Judith
Clapp [30] writes that evidence has shown that writing code is only
about 15 to 20% of the total cost of software development (15).
Why then the preoccupation with programmer productivity?

The  research  will  not  examine  project  cost  or  span  time  directly. 
In  the  data  base  being  used  the  total  duration  (span  time)  of  the 
project  is  not  simply  related  to  the  accumulation  of  the  hours 
planned  to  be,  or  actually  expended,  on  each  activity.  Personnel 
in the Air Force Data Systems Design Center were, more often than not,
assigned  to  tasks  in  several  projects.  For  example,  an  activity  es- 
timated to  consume  eight  hours  might  span  a week  or  more.  Furthermore, 
the  system  which  collected  the  estimates  and  expenditures  did  not 
accurately  keep  track  of  an  original,  overall  estimated  date  for 
project  completion.  It  is  therefore  not  propitious  to  analyze  the 
impact  of  estimating  accuracy  (in  development  activities)  on  the  ex- 
tension of  the  duration  of  a project,  using  these  data. 

Unfortunately,  cost  is  not  a data  element  in  the  project  con- 
trol system  providing  the  data  base.  However,  from  a positive  view- 
point, its  absence  prevented  it  from  obscuring  the  elementary  data 
elements (hours estimated and expended) which this research intended
to  analyze.  Adding  burden  rates,  overhead,  indirect  costs,  etc.,  and 
converting  hours  to  dollars  through  some  personnel  cost  table  can 
tend  to  obfuscate  the  elementary  act  of  estimating  activities. 

Research  and  Data 

Formalization of software production implies better definition
of the process and the tools, more standardization, and greater
control [30, 6]. Dr. Barry Boehm believes that not having formalized
and standardized data bases forces managers to rely on intuition when
making decisions on software development planning. Software phenomena
often tend to be counterintuitive. Given the magnitude of the risks
of basing major software decisions on fallible intuition and the
opportunities for ensuring more responsive software by providing
designers with usage data, it is surprising, says Boehm, how little
effort has gone into endeavors to collect and analyze such data. One
of the reasons progress in software project management is slow is that
it is just plain difficult to collect good software data. According
to Boehm these difficulties include:

1.  Deciding  which  of  the  thousands  of  possibilities  to 
measure. 

2.  Establishing  standard  definitions  for  "error,"  "test 
phase,"  etc. 

3.  Establishing  what  had  been  the  development  performance 
criteria. 

4.  Assessing  subjective  inputs  such  as  "degree  of  diffi- 
culty," "programmer  expertise,"  etc. 

5.  Assessing  the  accuracy  of  post  facto  data. 

6.  Reconciling  sets  of  data  collected  in  differently 
defined  categories.  [17] 

Dr. Boehm emphasizes that more work on these factors is necessary
to ensure that future software data collection efforts produce at
least roughly comparable results. However, the fact that the data
collection problem is difficult does not mean it should be avoided.

There  are  a few  quantitative  measures  available  for  evaluating 
software.  These  include: 

1.  Cost, 

2.  Speed, 

3.  Size  (number  of  instructions), 

4.  Effort  (man-months  or  hours), 

5.  Efficiency — dimensionless — a comparison  of  how  much 
hardware  is  used  compared  with  minimum  necessary,  and 

6.  Development  time.  [78] 

The  element  of  measurement  being  used  in  this  research  is  "effort," 
in  terms  of  a comparison  of  the  estimated  and  actual  values  reported 
on  resource  consuming  activities. 

J.  Farquhar  of  RAND  has  summarized  the  entire  issue  of  the  es- 
timation of  software  development  best  in  the  following  observation: 

Perhaps  a discussion  of  possible  research  should  close 
with  a restatement  of  the  importance  of  software  es- 
timation. It  seems  plausible  that  this  activity  lies  at 
the  very  heart  of  software  development  management.  Unable 
to  estimate  accurately,  the  manager  can  know  with  cer- 
tainty neither  what  resources  to  commit  to  an  effort 
nor, in retrospect, how well these resources were used.
The lack of a firm foundation for these two judgements can
reduce programming management to a random process in that
positive control is next to impossible. This situation
often results in the budget overruns and schedule slippages
that are all too common today. [49]

CHAPTER II

THE  PROBLEM  IN  PERSPECTIVE 


Management  and  Estimation 

Adams, in his dissertation on the estimation process, has focused
on the manager as the cynosural element of this process.

The manager has been described as a specialist in
the art of decision making. This description is
particularly apropos, for nearly all of his duties
involve making decisions. The manager first
identifies and defines a group of alternatives relevant
to the current problem. Uncertainty is present
since the consequences of those alternatives lie
in the future and must be anticipated, so the
probable results of each alternative are estimated.
The manager's ability to estimate accurately, in the
face of uncertainty, is thus one key factor which
determines, at least in part, his skill as a decision
maker and his success as a manager. [2]

Software  development  management  will  continue  to  deserve  its 
current  poor  reputation  for  cost  estimating  and  schedule  effective- 
ness until  such  time  as  a more  complete  understanding  of  the  develop- 
ment process  is  achieved.  Systems  are  still  built  like  the  Wright 
Brothers  built  airplanes — build  the  whole  thing,  push  it  off  a cliff, 
let  it  crash,  and  start  all  over  again  [106].  Other  symptoms  of 
management  failure  include  schedule  overruns,  poor  quality  products 
and failure to meet project objectives [8, 121]. At a National
Conference  of  the  Association  for  Computing  Machinery,  George 
Weinwurm  cautioned  that: 

A new  technology  such  as  ours  cannot  continue  to  grow 
exponentially  so  long  as  it  is  dependent  on  artisans 
for  everything.  The  need  is  for  formalized  procedures 
and  standards  that  permit  reasonably  capable  and  well- 
trained  people,  with  the  benefit  of  such  a methodology, 
to  do  what  artisans  could  do  before — and  do  it  con- 
sistently. [154] 
In support of this statement J. Farquhar observed that this need
for consistency is nowhere more apparent, or more painfully felt,
than in the area of "software estimation." Without some rational
methodology for software estimation, the entire programming management
activity becomes entirely subjective, and too subjective with regard
to the expense of the human and material resources involved [49, 76].
Jones [72] concurs but observes that objective and subjective
methodologies reference an idealized dichotomy. In practice, no
methodology is ever likely to be wholly objective or wholly
subjective.

Throughout this study the words estimator and manager are used
interchangeably. Managers are responsible for estimates but they
normally do not make them. In Adams' research, however, the
estimators were the managers, which violated the tenet that the
individual responsible for a task should make the estimate [2, 128].

Estimation is the assessment of resources needed for
accomplishing planned objectives. Practically, software estimates take
the form of a judgement of the amount of human effort, in man-hours,
required to produce the software in question. Scheduling is the
placement of resource consuming activities, intrinsic to the
objectives, into a time frame. The activities and assigned estimates
are among the most important factors influencing a schedule [150].

It is not news that the software industry is under criticism for
frequent and spectacular failures in providing effective and reliable
software within a reasonable cost envelope and time frame [105].

Kosy has reported several software development projects which have
overshot their original cost estimates by wide margins. For example,
software for the Federal Aviation Agency's Air Traffic Control
Center, initially estimated at $1.8 million, had cost $19 million as
of 1970. Developing a replacement for one of the subsystems in the
U.S. Air Force Air Defense Command's 496L system turned out to cost
nearly seven times more than the original estimates and even at that
price was never fully completed [78]. The problem is not new,
however. Jones [71] points out that many experts believe that software
development is not so different from other unstructured engineering
tasks. Indeed, the difficulty in making accurate estimates and the
difficulty in preparing cost and time projections are the same kinds
of problems that engineers had to struggle with in the infancy of
their science. The computer industry continues to forge ahead in
improvements to hardware, linguistics, file structures and operating
system capabilities but has given little attention to ways in which
to manage the other essential factor in every automated system:
software development. Great attention is given to saving nanoseconds
in the computer and too little attention to wasting hundreds of
man-years in developing the programs which are essential to take
advantage of the hardware speed.

Progress in software development management can only be based on
measurable performance. But management must be provided a baseline
against which to make decisions. Several appropriate measures exist
for gauging hardware performance (i.e., cost/bit, add time, transfer
rate, byte capacity, etc.) but software has few standardized meaningful
metrics [78]. Ronald Smith of IBM [138] believes that the
understanding of software management has not progressed far enough to
know exactly what data are required, when they are necessary, and in
what form they are needed to enable a more optimal management
performance. The key to successful management of a software project is
valid information on which intelligent decisions can be made. Grossly
inaccurate estimates serve as highly misleading decision making
criteria. Smith also emphasized that the importance of quality
management cannot be overstated, for it is clear that management has a
major effect on whether a project will succeed or fail. Failure, of
whatever kind, in software systems has proven to be very costly.

Crisis  in  Cost 

Much of the literature [158, 17] addresses itself to cost, which
is certainly the most significant symptom of the software development
dilemma. While each new processing technique or hardware gadget
keeps the market place thriving and lowers the cost of computing per
se, the cost to develop a system is expanding exponentially in the
opposite direction.

The "Information Processing/Data Automation Implications of Air
Force Command and Control Requirements in the 1980's," or CCIP-85,
study primarily showed that for almost all applications, software
(as opposed to computer hardware) was the major source of difficult
future problems and operational performance penalties. However, it
was found to be difficult to convince people outside the software
industry, primarily because of the scarcity of solid quantitative
data to demonstrate the impact of software on accurate scheduling
and operational performance [17]. Cost always has been and will
continue to be the most significant decision making criterion. The
rising cost of software is creating a crisis in the industry. Kosy
has neatly summarized the historical aspects of this crisis:

Requirements for software functions have grown, both in
kind and scope, at an exponential rate. As one index
of that growth, the amount of code that IBM has provided
as standard supervisor and utility software for
its line of computers has increased by a factor of 1000
over the last 15 years, doubling every 1.4 years. Not
coincidentally, the decreasing cost and increasing
power of computer hardware has permitted more elaborate
software aggregates to become feasible and brought
larger, more capable systems into common use. In contrast,
the methods used to obtain correct and reliable
software, to specify and design software systems, to
make and validate software modifications, and to
secure classified data in software systems have not
changed significantly over the same period. Despite
the technological improvements that, by our estimate,
have increased the productivity of programmers on
small projects by nearly a factor of six over the
past decade, large-scale state-of-art systems have
taken inordinate efforts to produce, and the effort
has increased nonlinearly the bigger the project.

More often than not, large software systems are delivered
late and cost more than was estimated. [78]
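
The growth figures in the quotation above can be checked with a few
lines of arithmetic. The sketch below is only an illustration; the
function names are hypothetical and it assumes smooth exponential
growth, which the quoted source does not state explicitly.

```python
import math


def doubling_time(growth_factor: float, years: float) -> float:
    """Years per doubling implied by a total growth factor over a period.

    Assumes smooth exponential growth (an assumption of this sketch).
    """
    return years / math.log2(growth_factor)


def total_growth(doubling_years: float, years: float) -> float:
    """Total growth factor implied by a fixed doubling time over a period."""
    return 2 ** (years / doubling_years)


if __name__ == "__main__":
    # A factor of 1000 over 15 years, as quoted.
    print(doubling_time(1000, 15))
    # Doubling every 1.4 years for 15 years, as quoted.
    print(total_growth(1.4, 15))
```

Under this assumption a thousandfold increase over 15 years implies a
doubling time of about 1.5 years, while doubling every 1.4 years would
give a factor closer to 1,700. The two quoted figures therefore agree
only approximately, which is consistent with their use as a rough index
of growth.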

At the 1968 NATO Software Engineering Conference, K. Kolence took
exception to the word "crisis" because it is a very emotional word.
The basic problem, according to Kolence, is that certain classes of
systems are creating demands which are beyond our capabilities, our
theories, methods of design, and production. Kolence emphasized,
however, that many areas of software development are still relatively
straightforward (e.g., payroll systems). D. Ross of MIT
responded that it makes no difference if one's arms, legs, brain and
digestive tract are all in fine working order if one is at the moment
suffering from a heart attack. One is still very much in a crisis
[106]. No discussion of the estimation problem can be complete without
acknowledging the significance of the cost symptom which the
industry seems unable to control.

Management  Pays  the  Price 

The CCIP-85 study revealed the following, somewhat staggering,
comparative information regarding software costs. For the Air
Force, the estimated annual expenditure on software for FY 1972
was between $1 billion and $1.5 billion, about three times
the annual expenditure on computer hardware, and about 4% to 5% of
the total Air Force budget. Similar figures hold elsewhere. The
recent World Wide Military Command and Control System (WWMCCS)
computer procurement was estimated to involve expenditures up to
$100 million for hardware and over $700 million for software. A
recent estimate for NASA was an annual expenditure of $100 million
for hardware, and $200 million for software, about 6% of the annual
NASA budget.

For some individual projects, overall software costs are:

IBM OS/360                       $200,000,000
SAGE                             $250,000,000
Manned Space Program, 1960-70  $1,000,000,000

Overall software costs in the U.S. are probably over $10 billion per
year, over 1% of the gross national product. Software expenditures
in the Air Force are estimated to be going to over 90% of total ADP
system costs by 1985; this trend is probably characteristic of other
organizations also [17].

If the software-hardware cost ratio appears lopsided now (refer to
Figure 1), consider what will happen in the years ahead, as hardware
gets cheaper, software systems become larger [125], and costs
(people) go higher and higher. Thayer [146] wisely observes that
software development is labor intensive.

Fig.  1.  Trend  comparison  of  computer  system  cost  distribution  in  a 
typical  company.  [80] 


At  the  1970  ACM  Conference,  Dr.  Barry  Boehm  stated  that  dollar 
costs  for  software  are  in  the  millions  because  of  project  slippages, 
budget  overruns,  and  what  software  managers  traditionally  call 
"agonizing  reappraisal."  That  is  the  point  where  management  realizes 
the  software  won't  work  and  that  large  pieces  of  the  system  have  to  be 
redesigned  [15]. 

Is this cost unimportant? Is it unnecessary that software
managers be able to plan, organize, direct and control the efforts
of systems development personnel toward efficient realization
of the objective, a system which provides accurate, timely output
when it was planned? Is the objective worthwhile at any cost?

It is suggested that the answer to these questions is a resounding NO!
Sheman also believes that it is not sufficient to accomplish the
technical objectives of the project while paying little regard to its
overall cost [134]. The industry must begin to look for ways not
only to reduce these costs, but to anticipate and control
them better. For example, development of Tactical Command and Control
Software may require 125 to 3330 man-years and from $7.5 to $200
million, depending on the size and difficulty of the system [78]. The
expanded RAND Corporation version of the CCIP-85 study emphasized that
budget controls will require minimizing the discrepancy between actual
and estimated software costs to 5% to 10% at most. The RAND study
also points out that this is not only a problem of cost control but
also of arriving at accurate estimates in the first place. While cost
estimates rely heavily on human judgement and probably cannot be
reduced to mechanical procedures, they should be based, as much as
possible, on quantitative relationships and data to minimize
subjectivity [78]. But cost is not the underlying problem in software
development; management is, and particularly the management function
of estimating [154, 25].
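
The discrepancy between estimated and actual cost discussed above can
be made concrete with a short sketch. The function names, the sign
convention (positive means an overrun), and the sample hours are
assumptions made here for illustration only; they are not taken from
the RAND study or from the data analyzed in this work.

```python
def arithmetic_difference(estimated_hours: float, actual_hours: float) -> float:
    """Actual minus estimated hours; positive values indicate an overrun."""
    return actual_hours - estimated_hours


def percent_difference(estimated_hours: float, actual_hours: float) -> float:
    """Difference expressed as a percentage of the estimate."""
    if estimated_hours <= 0:
        raise ValueError("estimated_hours must be positive")
    return 100.0 * arithmetic_difference(estimated_hours, actual_hours) / estimated_hours


def within_tolerance(estimated_hours: float, actual_hours: float,
                     tolerance_percent: float = 10.0) -> bool:
    """True when the absolute percent difference is inside the tolerance."""
    return abs(percent_difference(estimated_hours, actual_hours)) <= tolerance_percent


if __name__ == "__main__":
    # Hypothetical activities: (estimated hours, actual hours).
    for estimated, actual in [(400, 430), (400, 520)]:
        print(estimated, actual,
              percent_difference(estimated, actual),
              within_tolerance(estimated, actual))
```

With a 10% tolerance, the first hypothetical activity (a 7.5% overrun)
would meet the goal and the second (a 30% overrun) would not.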

The amount of available empirical analysis on estimating computer
programming projects is so limited that rules of thumb are
substituted for quantitative measures. Speculation is substituted for
understanding of the phenomena which influence estimates. It is
suspected that herein lies the reason that the basic problem of
software development management is continually avoided, and that so
little is known of the factors directly bearing on this problem. It
seems that in the absence of empirical analysis the industry is limited
to working out solutions to cure the symptoms of the problem. The
only statement that has been made about system development with
absolute certainty is that large systems require more resources and
time than small systems. Beyond this truism there is only
conjecture [116].

In his book, The Nature of the Physical World, Sir Arthur
Eddington observed:

We often think that when we have completed our study of
"one" we know all about "two," because "two" is "one and
one"! We forget that we still have to make a study of
"and." Secondary physics is the study of "and," that is
to say, of organization. [48]

Organization in Software Development

Two  recent  management  techniques  which  come  closer  to  bridging  the 
gap  between  concern  for  the  symptoms  and  analysis  of  the  underlying 
problems are both organizational in nature. One technique is IBM's
Chief  Programmer  Teams,  and  the  other  is  structured  programming. 

The development of these techniques has been motivated by a
desire to reduce the cost of developing and maintaining software by
reducing a program's complexity and increasing its clarity. The high
cost of programming is due, in large measure, to the complexity of
the programs. As a result of this complexity, the program development
process is characterized by a large number of mistakes and a
great deal of waste and rework. According to Terry Baker there is
a persistent myth that programming consists of a little strategic
thinking at the top (program design), and a lot of coding at the
bottom. But one small statistic is sufficient to explode this
misunderstanding. Including all overhead, five to ten debugged
instructions are coded per man-day on a large production programming
project [10].

The  problems  of  timeliness  and  reliability  are  problems  of 
organization  as  well  as  technology.  To  address  this,  Mills  and 
Baker  at  IBM  developed  a programming  organization  called  a Chief 
Programmer  Team  [10].  Canning  has  described  a programming  management 
procedure  used  at  Lockheed  Corporation  prior  to  1968  which  closely 
resembles  the  Chief  Programmer  Team  concept  [27]. 

A Chief Programmer Team represents a new managerial approach to
production programming. The changes include restructuring the
work of programming into specialized jobs; defining relationships
among specialists; developing new tools to permit these specialists
to interface effectively with a developing, visible project; and
providing for training and career development of personnel within these
specialities. A significant aspect of the Chief Programmer Team
approach is to minimize system interactions by concentrating all design
decisions on one man. If this aspect is lost, the group rapidly
converts into a standard pyramid-type organization, structured to control
rather than minimize interactions [6].

The second discipline, structured programming, defines a top-down
sequence for program unit creation and testing, and a technical
standard for the coding of each unit. Structured programming is a
manner of organizing and coding programs that makes the programs
easily understood and modifiable. Easy modification in turn permits
easy maintenance of the product and easy building of a new product
using the original product as a base. Structured programming is an
evolving discipline which has as its objective more reliable software
at less cost and time [29]. Much has been written about structured
programming in the last couple of years and its definition varies from
writer to writer. However, the fundamental message is "simplify the
program control paths" [100, 11].

These are developments that could revolutionize programming in
several ways. The most obvious benefits are increased productivity
and reduced error rates. An analogy has been made to a fact which
hardware people have known for years: any logic circuit can be made
up from a few basic primitives, such as the "AND" or "OR" operations.
Perhaps programming is approaching something of the same maturity.
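
As an illustration of the analogy, the sketch below builds a small
routine from only three control constructs: sequence, selection, and
iteration. Naming these three as the primitives of structured
programming is an assumption drawn from the general literature on the
subject, not from the sources cited above, and the function and its
data are hypothetical.

```python
def count_overruns(activities):
    """Count activities whose actual hours exceeded the estimate.

    `activities` is a list of (estimated_hours, actual_hours) pairs.
    The routine has one entry, one exit, and no jumps.
    """
    overruns = 0                          # sequence: statements in order
    for estimated, actual in activities:  # iteration: one simple loop
        if actual > estimated:            # selection: one two-way branch
            overruns += 1
    return overruns                       # single exit point


if __name__ == "__main__":
    print(count_overruns([(10, 12), (8, 8), (5, 9)]))
```

Because every path through the routine follows the nesting of the
constructs, the control flow can be read from top to bottom, which is
one reading of the advice to simplify the program control paths.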

Analysis of the individual components (Analysis, Programming,
etc.) involved in software development may be the best way to understand
and optimize the management of each of these phases. Programming
itself, as opposed to Analysis or Testing, is probably the easiest and
most tangible component to experiment with and study. Nevertheless,
research must begin to come to grips with the ineffective planning,
estimating and scheduling of the larger and more abstract phases
(Analysis, Design, Testing, etc.) of the software development process.
The interrelationships of these phases must also be examined. These
enigmatic aspects of the estimating problem will have to be investigated
if new projects are to be completed in a reasonable amount of
time, and if the total process is to evolve as a science.


Reasons  for  Poor  Estimating 

The renowned Dr. Frederick Brooks emphasizes that more software
projects have gone awry for lack of calendar time than for all other
causes combined. Brooks offers five reasons why this cause of
disasters is so common.

First, our techniques of estimating are poorly developed.
More seriously, they reflect an unvoiced assumption
which is quite untrue, i.e., that all will go
well.

Second, our estimating techniques fallaciously
confuse effort with progress, hiding the assumption
that men and months are interchangeable.

Third, we are uncertain of our estimates. We
let the urgency of the patron color (perhaps even
govern) our estimate of how long the project will
take.

Fourth, schedule progress is poorly monitored.
Techniques proven and routine in other engineering
disciplines are considered radical innovations in
software development.

Fifth, when schedule slippage is recognized,
the natural (and traditional) response is to add manpower.
Like dousing a fire with gasoline, this makes
matters worse. More fire requires more gasoline, and
a regenerative cycle begins which ends in disaster.
[22]

Jones believes that historically, the most important reason for
poor cost estimates is that the system configuration changes
substantially from the time the cost estimate is made until the time
the system becomes operational [72]. Conversely, Gildersleeve has
found that excessive estimate overruns are less a result of poor
estimating than they are of some combination of seventeen (17) various
management failures [59, 74].

The stories of resource estimates being missed by factors of two
or three are too numerous to be discounted as exaggerated rumor.
Brooks cautions that dangerous reasoning is expressed in the very unit
of effort used in estimating and scheduling, the man-month. Cost does
indeed vary as the product of the number of men and the number of
months. Progress does not! Hence the man-month, as the only unit for
measuring the size of a job, is a deceptive myth. It implies that men
and months are interchangeable [22].
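
The distinction between cost and progress can be sketched with a toy
model. Cost is computed as men times months, as stated above. The
progress model is purely an assumption of this sketch: it charges a
fixed overhead for every pair of people who must communicate, using
the pairwise count n(n-1)/2 that Brooks discusses elsewhere in his
book. The overhead coefficient and the task size are hypothetical.

```python
def cost_man_months(men: int, months: float) -> float:
    """Cost varies as the product of men and months."""
    return men * months


def communication_paths(men: int) -> int:
    """Number of distinct pairs among `men` people: n(n-1)/2."""
    return men * (men - 1) // 2


def months_to_finish(task_man_months: float, men: int,
                     overhead_per_path: float = 0.02) -> float:
    """Toy schedule model: each communication path consumes a fixed
    fraction of one person's monthly effort (hypothetical coefficient)."""
    effective_men = men - overhead_per_path * communication_paths(men)
    if effective_men <= 0:
        raise ValueError("communication overhead exceeds available effort")
    return task_man_months / effective_men


if __name__ == "__main__":
    for team_size in (4, 8, 50, 60):
        months = months_to_finish(24, team_size)
        print(team_size, months, cost_man_months(team_size, months))
```

In this model doubling the team from four to eight does not halve the
schedule, and past a certain size adding people lengthens it, while
the cost in man-months keeps rising. The numbers carry no empirical
weight; they only show why men and months need not be interchangeable.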

It is also unrealistic to put managers under pressure to minimize
both elapsed time and resources in software development projects. In
these instances the manager is not being asked to estimate and schedule,
but merely to provide numbers that fit within arbitrarily pre-defined,
and probably unrealistic, limits. False scheduling to satisfy a user's
or top management's desired completion date is much more common in the
software engineering discipline than in other areas of science. The
problem is that it is very difficult for a manager to make a convincing
and job-risking defense of an estimate that is derived by no
quantitative method, supported by little data, and certified by "rules
of thumb" [22].

A New  Perspective 

At least two approaches are necessary in attacking the problem of
accurate software estimation. The first is the more pragmatic: it is
necessary to collect data on software project estimates and expenditures
as they occur. These data, and the experiences associated with
software development projects, must be shared. However, even these
simple steps would involve a considerable expenditure of money, a
serious commitment to the effort, and the courage on the part of
management to admit to the errors made during the development cycle.
"Egoless management," so to speak, is probably a contradiction in terms
and is too much to hope for. Unfortunately, software managers who
must report actual expenditures and failures which occur during
development believe that this information will be used against them [30].
Nevertheless, Toffler's philosophy teaches that the imagination is only
free when fear of error is temporarily laid aside [34]. Two recent
publications which reflect on the mistakes made during software
development are The Mythical Man-Month by Dr. Frederick Brooks [23]
and "The MUDD Report" by David Weiss [155]. Boehm suggests it is
worth working on a mechanism for transferring government software
experience into the commercial sector [15]. The second approach that
is needed is for government and industry to support and conduct basic
research using the data collected. The problem with these two approaches
is: where shall the data come from? Furthermore, estimating
accuracy is not the only function of the estimating process [9].

Project  Control  in  Perspective 

Mounting expenditures in software development are no longer
matched by rising economic returns from that software. Although most
of the statements which allude to this state of affairs come from
profit-seeking organizations, the same general problems are known to
exist in other organizations, notably governmental activities [60].

Software projects can be structured and controlled. There are
certain activities which must be performed in any software development
project. There is also a natural sequence for performing many of
these activities. If a software development project is properly
defined and structured, management has the techniques and tools
available to control the efficient accomplishment of the project [38, 95,
159].

Project control can have some beneficial effect on ensuring
that software project costs, project duration, and (perhaps
surprisingly) project satisfaction are held near the limits
specified in the project plan. The purpose of project control is to make
it easier for management to identify errors or potential errors, and to
take action to remedy them. A project will not succeed because of
project control but it is unlikely to succeed without it [132, 87, 14, 93].
Control is a state of mind that recognizes a commitment to achieve
certain objectives. Control is also the result of that commitment. A
manager makes assumptions (estimates), then the control system status
reports compare these assumptions with how the environment is
interacting with the assumptions [21, 99]. The problem central to the
control and reporting of status is the nature of progress [133, 157].
It is clear that in software development the final product
cannot be simply described as the sum of the individually completed
activities presently used to define the process. Nevertheless, in
large development organizations the collection and analysis of project
control data should also have a long range benefit in terms of factual
experience accumulated. Unfortunately, an organization having a
superior system for project control but without the other ingredients
of a good project management system is guilty of suboptimization of
a most serious kind.

A project control system is, after all, just one major link in
the chain associated with managing software development projects. It
is also important to obtain measurable evidence (i.e., data system
specifications) to establish that activities were completed [39].
Such evidence also enhances quality control since these products can
then be evaluated. This implies some type of broad project control
policy.

It would be misleading to suggest that project control is the
major problem in the software management area. According to some
experts project control is not even a major problem. As evidence,
consider the following.

In September 1969, the Founding Conference of the Society for
Management Information Systems was held at the University of Minnesota
[42]. Attendees at the conference were asked by the University's
Management Information Systems Research Center to complete a questionnaire
concerning several topics. Most relevant is a question
concerning the factors in an organization's environment which are
associated with software project success or failure. The respondents rated
the overall importance to project success of thirty-four factors.
Table 1 shows how the factors related to project control compare
with the other factors.

would be better to direct research toward general factors associated
with project success and failure than toward project control, per se.
One relationship which suggests itself is that it is the lack of
adequate management information systems management that is a very
great force retarding our progress toward the solution of the
problem of collecting and analyzing data and improving estimating
accuracy.

For  many  years,  software  development  projects  and  data  processing 
managers  have  been  allowed  to  exist  in  an  environment  in  which  there 
has  been  no  way  of  measuring  their  performance.  Data  processing 
managers  have  therefore  not  been  held  accountable  for  management 
actions.  In  other  words,  software  development  has  been  operating 
out  of  control  for  a considerable  time.  One  way  to  correct  this 
situation is to require the software manager to develop software
projects  within  a control  system  similar  to  those  applied  to  other 
functional  managers  [143].  Once  the  process  of  software  development 
is  measurable  and  visible,  it  follows  that  managers  and  estimators, 
as a matter of self-interest, will adopt better managerial and
estimating practices. Standardized estimating techniques will evolve
and data will be collected and analyzed about the factors which
influence estimating and decision making relative to the software
development  process.  Project  control  is  significant,  therefore,  in 
that  it  is  the  best  technique  for  obtaining  the  measurable  data 
that  are  required  for  research. 


CHAPTER  III 


LITERATURE  SURVEY 


An  Old  Problem 


Since the early 1960's planning and controlling the cost and
duration of software development projects have been the foremost
management problems in Computing Science. Estimating resource requirements
is the one activity accomplished in planning which is least understood
and least controlled.

The  following  paragraphs  are  excerpts  from  the  introduction  at  the 
first  symposium  on  the  evolving  problems  of  managing  the  development 
of  computer  program  systems.  Though  written  thirteen  years  ago,  it 
would  seem  that  these  remarks  are  still  particularly  appropriate  to 
the  situation  existing  today. 

In our concern with the day-to-day management problems
of system development, we sometimes forget how young
the software industry is. Little more than ten years ago,
one man could design, code, and test a typical computer
program. Since that time, however, programming has increased
vastly in scope and complexity. In order to
develop many of today's large-scale computer applications,
it is necessary to segment computing problems and arrive
at the desired solutions through the concentrated efforts
of many designers and programmers.

As the dimensions of computer programming have increased,
the normal activities of management have correspondingly
become more difficult. The software development
manager at the present time operates in an environment
of changing requirements, limited time, too few qualified
personnel, and often, untried equipment. Most of a manager's
energy is thus spent in an effort to get the job done.

To our knowledge no previous meeting has addressed
itself specifically to the management process in the development
of large computer programming systems. There are, we
think, two reasons why there has not been more concern about
software management. First our heritage stems from the
areas of engineering and scientific computation where computer
programs are prepared on a relatively small scale,
and where management problems do not approach in breadth or
depth those that must be faced in developing large computer
program systems. Second, we have failed to consider critically
our management practices. We seem to have been so
concerned with getting the job done that we seem to have
spent far too little time in planning and developing good
management techniques. It is our conviction that we can
improve our management methods if we are willing to invest
the necessary time. One of the symposium participants
expressed the urgency of this notion very well when he said
that we must provide ourselves with an information-handling
capability no less adequate than the systems we build for
other people. [105]

The symposium participants saw nothing abstract in the difficulties
of producing a large program system. To them the complex realities
of slipped schedules and the chronic shortage of experienced
programmers posed concrete problems requiring hardheaded solutions.
This pragmatic outlook, which was evident in all of the symposium
discussions, was neatly summed up by an unidentified speaker who
pointed out that software development managers are faced with a
condition, not a theory. In large part, the condition is the result
of the immaturity of the software industry. However, other factors
also contribute to this condition. Among these are changing customer
requirements, accelerated delivery schedules, and the interface
problems between the hardware and software components of a
system. Other speakers expressed similar views. For example, if
managers could exercise better control over their management environment,
many development problems would not be so acute. Furthermore,
greater environmental stability would help to improve the reliability
of estimates and forecasts, and the quality of software production.

The symposium participants, on the other hand, made it clear
that the condition is also the product of shortcomings in management
itself. They agreed that to a large extent development managers
themselves are to blame for the severity of many problems. The
chief criticisms stated by the participants were that development
managers do not construct efficient management controls, and that they
are apparently unwilling to devote the time and resources required for
long-range planning [35]. The programming profession traditionally
has been deficient in establishing planning factors for the control of
the development process. Managers often grossly underestimate the
size and complexity of a system and then find themselves faced with
serious production delays and heavy cost overruns.

Nelson [110] believes that basic principles of management exist
which are applicable to all forms of coordinative activity. Management
may be defined as the process of accomplishing objectives by
establishing an environment favorable to performance by people
operating in organized groups. The essence of managing is the
attainment of coordination or harmony of individual effort toward
the achievement of group goals. The management process may be said
to consist of the performance of specific functions, namely: planning,
organizing, staffing and assembling resources, directing, and
controlling. These functions also apply to the management of computer
programming projects.

One universal management principle, for example, has been called
the "principle of the primacy of planning" [77]. That is, planning
is the primary requisite to the other managerial functions of
organizing, staffing, directing and controlling. This means that the
degree of control over a programming project can be no greater than
the extent to which adequate plans have been made for the project. It
can be less, of course, since contingencies can force modifications
in even the best laid plans; but the extent of planning sets the
degree of control that is possible. Furthermore, Nelson suspects
that inadequate planning is the primary reason for loss of control
on many computer programming projects. It is not the comparative
newness of the computer programming process, difficulties with
programmers, nor technical factors; it is simply that programming
projects are not adequately planned in the first place [110, 111].

New  Research 

In 1974 the MITRE Corporation was contracted by the U.S. Air
Force to research the problem of the engineering of quality software
systems. Among the considerations in this research, completed in
January 1975, were:

1. Effects of management philosophy on software production, and

2. Measurement of the complexity of computer software [31].

In the introduction to this study, Judith Clapp reasoned that the
category of quality analysis is just as essential to quality software
engineering as to the general area of design and implementation.
Without quantitative measures of software the goal of quality software
is futile. The collection and analysis of data about software
development (underline added) are as critical as development of the
software itself. Data are essential ingredients in any engineering
activity, but this fact has been neglected with respect to software.
Ms. Clapp also emphasized that the conduct of experiments in software
production is a worthy subject for research itself, and that other
methods for evaluating new techniques appear to be necessary. One
area requiring further research concerns methods for collecting and
analyzing data about both software development costs, and failures
and errors during software development [30].

It is very difficult to estimate the time and cost necessary to
produce a computer software package. This difficulty stems from a
poor understanding of the production process, the number of disparate
factors affecting program complexity, and the limited ability
of managers to assimilate and effectively weigh these factors [49].
Conforti [33] has listed what he believes are barriers to the
development of simplified project estimation techniques.

1. Managers themselves. Managers realize that systems work
is somewhat subjective, and feel they cannot develop an
"accurate" method of easily (underline added) estimating
projects. There is an overconcern for accuracy.

2. Upper management's preoccupation with back-up material.
Operating in a petty mode where every man-hour must be
explained requires a detailed, complex process for
developing estimates.

3. The programmer himself. He views his job as a highly
creative task, as opposed to making an objective
examination of a job with the same basic components
used over and over again.

Unfortunately, Conforti minimizes the impact of his accurate
observations above by stating that since general time standards are known
for the basic components of any program, the task of project
estimation is quite easy. There appears to be an inconsistency in this
comment. While estimating standard components of a program may be
relatively easy, estimating a software development project is very
difficult and complex. Studies concerning software estimations
have identified at least ninety factors that affect the overall
cost of just program development [51]. Jackson states that it is
difficult to estimate how long it will take to write a program and
believes that as yet there is not any satisfactory way of making
these estimates [70].

A MITRE study [30] concluded that prior to the start of software
development, management must decide what time and resources
will be allocated [34]. Resources include money, people and computer
time. Elapsed time is another important resource parameter
identified  in  much  of  the  literature  [137].  Reliable  estimating  is 
an  important  management  function  and  is  necessary  to  reduce  the 
uncertainty  inherent  in  software  development  plans  and  to  support 
evaluation  of  performance  of  software  projects. 

In 1974, Morin [101], at the University of North Carolina,
researched the Estimation of Resources for Computer Programming
Projects. She described the study as an effort to draw together
what is known about the development of accurate or reliable
techniques for estimating manpower, computer resources and elapsed time
for a computer programming project. This work was primarily a
survey and evaluation of the literature on resource estimation for
computer programming projects. Working under Dr. Frederick Brooks,
a recognized expert on software development [23], the researcher
classified the literature into three convenient categories:

1. Studies describing estimating methods,

2.  Papers  describing  estimating  guidelines,  and 

3.  Literature  discussing  the  estimating  problem. 

The  first  category  consists  of  the  most  relevant  information  and 
previous  research,  including  some  specific  techniques.  The  second 
category  discusses  the  problem  and  offers  suggestions  from  experience 
and  proposes  general  rules  of  thumb.  The  last  category  deals  with 
general discussions of programming management, with estimation
discussed as one of the many facets of the problem. The research of
the  literature  in  this  dissertation  will  adhere  closely  to  Morin's 
taxonomy  of  the  literature. 

Estimating  Methods 

Estimating  methods,  depending  upon  the  type  of  data  available, 
can  be  further  classified  into  several  techniques.  Nelson  [112] 
provides  four  categories,  based  on  the  primary  calculations  required 
to  develop  the  estimate.  These  categories  are: 

1.  Specific  Analogy.  Using  known  costs  of  similar  projects 
to  determine  estimates  for  resources  of  a proposed  system. 

2.  Unit  Price.  Multiplying  a previously  determined  cost  per 
unit  for  a given  resource  by  the  number  of  units  to  be 
delivered  in  a proposed  project. 

3.  Percent  of  Other  Item.  Setting  the  cost  of  a part  of  a 
proposed  project  as  a predetermined  fraction  of  another 
part. 

4.  Parametric  Equations.  Using  an  equation  that  represents 
the  cost  of  a proposed  project  as  a function  of  various 
characteristics  of  resources  expected  to  be  used  under 
the  anticipated  working  conditions. 
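
The four categories above can be sketched as four small calculations. This is an illustrative sketch only, not Nelson's own formulation: the function names are hypothetical, averaging the costs of similar projects is just one assumed way of applying a Specific Analogy, and the linear form used for the Parametric Equation is likewise an assumption.

```python
from typing import Sequence


def specific_analogy(similar_project_costs: Sequence[float]) -> float:
    """Estimate from known costs of similar projects (assumed here: their average)."""
    if not similar_project_costs:
        raise ValueError("at least one similar project is required")
    return sum(similar_project_costs) / len(similar_project_costs)


def unit_price(cost_per_unit: float, units_to_deliver: float) -> float:
    """Previously determined cost per unit times the number of units to be delivered."""
    return cost_per_unit * units_to_deliver


def percent_of_other_item(other_item_cost: float, fraction: float) -> float:
    """Cost of one part as a predetermined fraction of another part."""
    return other_item_cost * fraction


def parametric(coefficients: Sequence[float],
               characteristics: Sequence[float],
               intercept: float = 0.0) -> float:
    """Cost as a function of project characteristics (assumed here: a linear equation)."""
    if len(coefficients) != len(characteristics):
        raise ValueError("one coefficient is needed per characteristic")
    return intercept + sum(c * x for c, x in zip(coefficients, characteristics))


if __name__ == "__main__":
    # All figures below are invented for illustration.
    print("Specific analogy:", specific_analogy([100.0, 140.0]))
    print("Unit price:", unit_price(2.5, 4000))
    print("Percent of other item:", percent_of_other_item(10000.0, 0.25))
    print("Parametric:", parametric([2.0, 3.0], [10.0, 5.0], intercept=4.0))
```

The first three calculations differ only in where the multiplier comes from; the parametric form is the only one that needs fitted coefficients, which is why it depends most heavily on the quality of the historical data.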

Lecht [84] has described a representative and workable Specific
Analogy estimating method. This method almost entirely depends upon
the use of cost data from similar projects to make estimates
concerning the cost of the proposed system. Furthermore, this technique
uses past experience of the estimator, but is generally restricted
to the scope and size of projects previously produced. Where
resources are extrapolated for larger projects, a linear relationship
is generally assumed between program size and costs. However, there
is considerable evidence which contradicts the assumption of
linearity [158, 101]. Other Specific Analogy methods have been described
by Krauss [79] and Hurtado [68]. An interesting on-line
"conversational" visual display technique for estimating the cost to
manufacture a product has been described by Kelly [75]. This
experimental approach could have application to software development
estimation by standardizing and simplifying the process, and by having
the experience data readily accessible [91].

Estimation of cost by government cost analysts has traditionally
relied heavily on the costs, actual or estimated, of prior,
contemporary or projected analogous systems. Guidance for using experts
as a data source for cost estimating is not well documented in
published literature according to Jones. Cost analysts rely almost
exclusively on their intuitive judgements in tapping such expertise
[73].

Estimating methods which could be classified under Specific
Analogy include some of the automated project management systems.
These systems provide the manager with some assistance in estimating,
planning and scheduling. Project management systems also provide a
data base of a project's progress and costs as compared to the
effort which had been planned. Systems of this type include the U.S.
Air Force's Planning and Resource Management Information System
(PARMIS) [148], System Development Corporation's "Software Factory"
[20], and many others [149, 102, 7, 92].

Myers [104] suggests that this "experience" method is frequently
unreliable due to the following factors:

1. The relationship between cost and system size (number of
program statements) is not linear.

2. Projects with similar names (i.e. payroll system) are
often very dissimilar in terms of development costs.

3. Manipulations by management to avoid overruns make
historical data questionable.

Aron [5] has described a Quantitative Method which is an
example of the Unit Price Technique. This method uses the
previously determined rate of programmer productivity (deliverable
instructions per unit time) to calculate manpower requirements for
large system development efforts. Aron confesses that this
estimating technique is not a precise method and in fact is not as good
as an estimate based on "sound" experience.

Project  managers  at  IBM  have  developed  a technique  which  can 
be  classified  as  a Percent  of  Other  Item  method  [89].  This  method 
determines  the  Net  Program  Development  Time  in  man  days,  using 
program  complexity,  programmer  experience,  and  ability.  For  example, 
complexity  is  assigned  a certain  value  according  to  input/output 
characteristics  and  processing  requirements  (such  as  calculations  and 
editing).  Other  phases  of  system  development  are  determined  as 
percentages  of  the  Net  Program  Development  Time.  This  technique  has 
the same disadvantage as the Quantitative Method, in that both
rely on the estimator's capacity to assign weights to nebulous
variables.

Attempts  to  obtain  Parametric  Equations  for  estimating  resources 
for  software  development  projects  have  been  unsuccessful.  In  fact, 
there  are  three  studies  (SDC  [50],  PRC  [61],  Eudy  [56])  which 
encompass  what  can  be  described  as  basic  research  into  this  problem. 

The  most  significant  study  was  conducted  by  the  System 
Development  Corporation  (SDC)  for  the  U.S.  Air  Force  from  1964  to  1966. 
Interestingly,  Dr.  Barry  Boehm,  an  acknowledged  expert  on  the  impact 
of  software,  recently  identified  the  SDC  work  as  the  most  exhaustive 
quantitative  analysis  done  to  date  on  factors  influencing  software 
development [17]. If Dr. Boehm is correct, then there has not been
any significant basic research in this critical area of software
development estimation for more than ten years. The SDC study was
performed in three cycles, each consisting of the collection and
analysis of new data for improving the results achieved from previous
cycles. For each cycle, data were collected using an after-the-fact
questionnaire, completed by project managers on programming projects
which had already terminated. Multiple regression was applied to
the numerical data used in the SDC studies in order to derive
estimating equations. Nelson [110] cautiously observes, however,
that although multiple regression reveals important parameters, it
does not necessarily represent natural laws. The data collection on
programming projects during all three cycles was restricted to costs
incurred during the programming phase only, i.e. program design, code,
and program test. In 1967, the final product of the three year SDC
study was published as a Management Handbook [108] on cost estimation
[120, 119]. Although this document is often cited, there are no
reported uses of the resultant equations by programming managers.
For example, the thirteen parameter equation for project size had
a mean of 40 man-months and a standard deviation of 62 man-months
[123]. Nevertheless, the study has made a valuable contribution
to  the  understanding  of  why  computer  programming  resource  estimation 
is  so  difficult.  At  least  eleven  major  publications  evolved  from 
the  work  at  SDC  [50,  51,  152,  153,  82,  107,  55,  108,  81,  32,  109]. 

Larger and more expensive programming projects require
disproportionately more resources than smaller projects. The SDC study
observed this nonlinearity during each cycle of the research
project. However, even though the nonlinearity was apparent, SDC
attempted to fit the cost data to a linear model by deleting some
points, while transforming other data points which represented the
larger development efforts. Thus, the SDC work rationalized away
this important cost relationship. Pietrasanta reported on some
investigations, which revealed that as systems increase in size
there appears to be a disproportionate increase required in
man-hours for development. Perhaps the simplest explanation comes from
the definition of a software system. Such definitions prescribe
that software systems consist of two elements: programs and
interfaces. Pietrasanta's explanation for this nonlinear increase is
that a linear increase in programs ($y = x$) is accompanied by an
exponential increase ($y = x^{N}$) in the potential interfaces between
the programs [116].
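
The disproportionate growth of interfaces can be illustrated with a small sketch. It assumes, purely for illustration, that every pair of programs may share one interface, so that n programs have n(n-1)/2 potential interfaces. This pairwise count is an assumption of the sketch, not the form quoted from [116]; it grows quadratically rather than exponentially, but it shows the same qualitative effect.

```python
from typing import Iterable, List, Tuple


def potential_interfaces(n_programs: int) -> int:
    """Count distinct pairs of programs, assuming one potential interface per pair."""
    if n_programs < 0:
        raise ValueError("the number of programs cannot be negative")
    return n_programs * (n_programs - 1) // 2


def growth_table(sizes: Iterable[int]) -> List[Tuple[int, int]]:
    """Pair each system size (in programs) with its potential interface count."""
    return [(n, potential_interfaces(n)) for n in sizes]


if __name__ == "__main__":
    # Doubling the number of programs roughly quadruples the interfaces.
    for n, links in growth_table([5, 10, 20, 40]):
        print(f"{n:3d} programs -> {links:4d} potential interfaces")
```

Under this assumption the interface count soon dominates the program count, which is consistent with the observation that man-hours rise faster than system size.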

An outgrowth of the SDC work was a study conducted by C. W. Eudy
and described by John Gayle [56]. Eudy hoped to increase the
reliability of the estimating equations by selecting a set of data with
more similar characteristics. For example, he developed a set of
equations for estimating resources required by programs written in
a single language (e.g. COBOL), and for a single computer (e.g. the
IBM 360/40). Eudy's equation for manpower contained a dominant
factor which indicated that an increase in distance between the
programmer and the computer could reduce the amount of manpower required.
Gayle suggests that the reason for this effect is that the minor
inconvenience of distance motivates programmers to check their work
more thoroughly before submitting a job. This reasoning is not
supported by any data and it seems highly improbable that distance
should be the dominant factor in the equation for manpower.

Simultaneously with the SDC work, the Planning Research
Corporation (PRC) [61, 62] also received a U.S. Air Force contract to
study the applications, effectiveness and problems of U.S. Air
Force information processing systems. In Phase I of this effort,
the PRC researchers summarized and structured the Air Force's data
processing experience. They believed that the key to retrieving
experience information was "workload," which is defined to be
quantities of information processed. This variable was chosen
because:

1.  "workload"  was  believed  to  be  a causal  factor  of  cost 
and  development  time, 

2.  "workload"  could  be  quantitatively  measured,  and 

3. "workload" factors would be available in a proposal
if a thorough analysis of the problem had been
performed.

Forty numerical "workload" descriptors were identified. Phase II was
to confirm what, if any, relationships existed among the "workload"
descriptors, cost, and development time. The PRC study developed
estimating equations for the entire development process (encompassing
System Design, Programming, Testing and Implementation phases) and
subsequent maintenance efforts. Historical data on past Air Force
programming projects were tabulated on scales. A plastic template
was used to find "relevant" past projects. The relevance of these
projects was summed over a set of factors in order to find a group
of "most relevant" past projects. Estimates were obtained by
referencing data from these "most relevant" projects. Other methods
were developed to check the reasonableness of the estimated values
of certain factors. The statistical methods used in the research
were: (1) scatter plot analysis, (2) correlation analysis, (3)
analysis of variance and co-variance, (4) multiple regression
analysis, and (5) factor analysis. The PRC researchers themselves
noted that the estimating equations displayed wide prediction
intervals because of their small sample size. PRC recommended
increasing the sample to 200 projects in Phase III. Their proposal
for Phase III was not approved by the Air Force and the research
terminated.

Estimating  Guidelines 

Software development is a complex process not thoroughly
understood even by those who do it. As a result, the accuracy of
estimates depends on the skill of the project manager to assess the
influence of many different factors. Until the relative influence
of these factors on the development process can be established
conclusively, estimates of cost will probably continue to depend on
the experience of the estimator, and qualitative rules of thumb
which may be manual or automated.

The following four broad rules of thumb are typical of those
used throughout the industry for scheduling software development
life cycles.

1. Aron [5] adopted a scheduling rule of 30% planning,
20% coding, and 50% testing, from the background work
which had been done by several researchers at IBM.

2. Brooks [23] considered 33% planning, 17% coding and
50% testing appropriate, based on several years of
personal experience at IBM.

3. SDC [101] arrived at 34.5% planning, 17% coding and
47.5% testing, by averaging the experience data of
the time consumed by programmers in four projects
developed for the military.

4. Wolverton [158] reports that various researchers have
discovered, by empirical methods, that analysis and
design account for 40%, coding and debugging account
for 20%, and checkout and test account for 40% [36].
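
A rule of thumb of this kind is applied by splitting a total effort figure across the phases. The following is a minimal sketch under stated assumptions: the function name, the dictionary layout, and the 1000-hour total are hypothetical, and the percentages are divided by their own sum so that a rule whose printed figures do not add to exactly 100 still allocates the whole total.

```python
from typing import Dict

# Percentages as quoted in the rules of thumb above.
PHASE_RULES: Dict[str, Dict[str, float]] = {
    "Aron": {"planning": 30.0, "coding": 20.0, "testing": 50.0},
    "Brooks": {"planning": 33.0, "coding": 17.0, "testing": 50.0},
    "Wolverton": {
        "analysis and design": 40.0,
        "coding and debugging": 20.0,
        "checkout and test": 40.0,
    },
}


def allocate_effort(total_hours: float, percentages: Dict[str, float]) -> Dict[str, float]:
    """Split a total effort figure across phases in proportion to the given percentages."""
    total_pct = sum(percentages.values())
    if total_pct <= 0:
        raise ValueError("percentages must sum to a positive value")
    return {phase: total_hours * pct / total_pct for phase, pct in percentages.items()}


if __name__ == "__main__":
    # 1000 hours is an invented total used only to show the arithmetic.
    for rule_name, rule in PHASE_RULES.items():
        print(rule_name, allocate_effort(1000.0, rule))
```

All three rules place at least 40% of the effort after coding, so an estimate built only from coding time understates the total by a wide margin.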

In 1972, Boehm [17] reported software efforts expended on large
development projects in slightly different categories. In Table 3,
the percentage of effort for each development phase by project is
given.


Table 3. Percentage of Effort by Development Phase

              Analysis      Coding        Checkout
              and           and           and
              Design        Auditing      Test

SAGE          39%           14%           47%
NTDS          30            20            50
GEMINI        36            17            47
SATURN V      32            24            44
OS/360        33            17            50
TRW Survey    46            20            34

Probably the most comprehensive tutorial on estimating the cost
of software development is a paper by Ray Wolverton [158].
Wolverton reviews the development process and the traditional method
used in estimating, and identifies many of the pitfalls in price
estimation. One of the most significant observations is that,
regardless of how accurate the original estimates, predictions cannot
be fulfilled unless some mechanism for management control is worked
out in advance. Estimating and project control are usually thought of
as two separate functions, but, in reality, are symbiotic. However,
as late as 1970, the response to a survey of some 150 management
information system specialists indicated that the presence of
"measurable objects" was of great importance to overall project
success and that other attributes of project control (such as the
use of a formal control system) were relatively unimportant [42].

Software  is  basically  an  intangible  product  during  most  of 
the  development  process.  Contrary  to  experience,  the  assumption 
is  that  changes  to  programs  not  yet  operational  are  easily  made. 
Since  software  system  modules  are  not  visibly  connected,  in  contrast 
to  hardware  systems,  the  impact  of  a change  is  often  not  readily 
apparent  even  to  the  designers  of  the  system. 

Wolverton  proposes  five  management  principles  (which  apply 
throughout  the  software  development  cycle)  to  reduce  the  problem  of 
control  to  a manageable  size. 

1. Develop software documentation (i.e. system
specifications) throughout the project which can be used as an
instrument by which management controls the project.

2.  Conduct  technical  reviews  of  system  design  so  that 
agreed  upon  baselines  are  established. 

3.  Control  software  physical  media  (tapes,  decks,  etc.) 
to  assure  a known  configuration  throughout  testing 
and  implementation. 

4. Apply stringent configuration management controls to
assure that necessary changes, and their impact on the
schedule and the system, are fully understood.

5. Provide a control system and status reporting
repository to assure that changes to plans, status, and
system configuration, are available to all project
personnel and users. [158]

Another important project management activity involves conducting
system status reviews so that problems can be identified and
update planning can occur as required. Plans and estimates must be
kept realistic.

Wolverton describes in detail the TRW methodology which has
been used for several years with reportedly good results. TRW's
estimation algorithm assumes that costs vary proportionally with
the number of instructions. For each identified routine the
procedure combines an estimate of the number of object instructions,
category (control routine, input/output, etc.), relative degree of
difficulty, and historical data (in dollars per instruction) from
a cost data base. After these variables are estimated and
classified, the various development phases are identified. An estimate
is made of the fraction of the total effort to be allocated for
each phase. The next step involves defining the activities in each
development phase by means of a 25 x 7 matrix, with twenty-five
activities for each of the seven phases. An associated cost
matrix is also introduced. The final step is to provide the schedule
data based on the customer's statement of work and other
management considerations. Overhead burden rates are also input. The
outputs from the algorithm include: (a) cost per routine, (b) a
graphic display of the schedule, (c) cost breakdown by development
phase activity, and (d) manloading and other details of cost and
computer loading. The outputs are considered only as an initial
estimate by the cost estimation group, which examines the information
in conjunction with other sources of data and the project objectives.
The final approved set of estimates is input to an official pricing
computer run.
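
The first two steps of the procedure described above (cost per routine, then a split by development phase) can be sketched as follows. This is not TRW's algorithm: the rate table, the category and difficulty labels, the phase names, and every dollar figure are invented for illustration, and the 25 x 7 activity matrix, schedule data, and overhead rates are omitted.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# Hypothetical cost data base: dollars per object instruction,
# keyed by (category, difficulty). The figures are invented.
DOLLARS_PER_INSTRUCTION: Dict[Tuple[str, str], float] = {
    ("control", "easy"): 20.0,
    ("control", "hard"): 30.0,
    ("input/output", "easy"): 15.0,
    ("input/output", "hard"): 25.0,
}


@dataclass(frozen=True)
class Routine:
    name: str
    object_instructions: int  # estimated number of object instructions
    category: str             # e.g. "control" or "input/output"
    difficulty: str           # relative degree of difficulty


def routine_cost(routine: Routine) -> float:
    """Cost proportional to instruction count, at the rate for its category and difficulty."""
    rate = DOLLARS_PER_INSTRUCTION[(routine.category, routine.difficulty)]
    return routine.object_instructions * rate


def total_cost(routines: Iterable[Routine]) -> float:
    """Sum the per-routine costs."""
    return sum(routine_cost(r) for r in routines)


def cost_by_phase(cost: float, phase_fractions: Dict[str, float]) -> Dict[str, float]:
    """Allocate a total cost to phases using estimated fractions of total effort."""
    if abs(sum(phase_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("phase fractions must sum to 1")
    return {phase: cost * fraction for phase, fraction in phase_fractions.items()}


if __name__ == "__main__":
    routines = [
        Routine("scheduler", 1000, "control", "hard"),
        Routine("tape reader", 2000, "input/output", "easy"),
    ]
    cost = total_cost(routines)
    print("Total:", cost)
    print(cost_by_phase(cost, {"design": 0.4, "code": 0.2, "test": 0.4}))
```

Because cost is taken as strictly proportional to instruction count, any error in the size estimate passes straight through to the cost figure, which is one reason the outputs are treated only as an initial estimate.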

Wolverton  is  quick  to  point  out  that  valid  data,  based  on 
proven  experience,  are  required  for  any  cost  estimation  process.  He 
emphasizes that there is "no universal model" for estimating
software costs accurately.

An earlier effort to assist programming managers in planning
the entire software development process was a Planning Guide [52]
developed by SDC in 1965 for the Office of Naval Research. The
Planning Guide divided the program development cycle into eight
phases: (1) systems analysis, (2) design, (3) program development,
(4) coding, (5) checkout, (6) documentation, (7) user training, and
(8) turnover. Various cost factors were listed for each development
phase.

Pietrasanta [116] divides manpower into two categories: (1)
development programmers, and (2) technical support personnel.
Development programmers design, code and test. Pietrasanta believes that
more estimation errors are made in considerations concerning support
personnel than programmers. The ratio of programmers to support
personnel varies with the development project, but most projects
operate on a one to one ratio. In some projects this ratio increases
to a one to two ratio, and in a few large systems, approaches a one to
three ratio. He believes that as the size of the system increases,
the increase in technical support personnel is greater than the
increase in the number of analysts and programmers.

The following statements reflect the variety of productivity and
cost estimating guidelines found throughout the literature. Each is
subject to a variety of qualifications. Delaney [40] proposes (as
a guideline for estimating acceptable programmer productivity) a
broad average of ten machine instructions per man-day or an input
rate of 3000 to 3600 machine instructions per man-year. Corbato
[23] reported in 1968 a mean productivity of 1200 debugged PL/1
statements per man-year on the large Multics system. Ercoli [106]
estimates the cost of debugged instructions from $1 to $20 per
instruction. A recent Department of Defense study indicates that
software for its airborne computers cost $75 per instruction to
develop [141].

Productivity rates must be used with caution because of the
widely varying performance of programmers. At the conference on
Software Engineering (Garmisch, 1968), David [106] notes an experiment
conducted by SDC which reflected a range of 25/1 between the
performance of programmers in completing the solution to a specific
logic problem. Much of the difficulty with using productivity rules
of thumb in estimating is the confounding effect of a myriad of
factors influencing productivity. It is generally accepted today
that high-order languages (HOL) and other software tools increase
programmer productivity but there is considerable difference of
opinion regarding the magnitude of the increase. Corbato states
that the number of checked-out lines of code per day, using either
assembly language or a HOL, is approximately the same [78]. It can
be reasoned, therefore, that productivity, for a HOL, would increase
by a factor of 4-6, equivalent to the average number of machine
instructions generated by a compiler per source statement. Data from
SDC on 14 HOL written programs (JOVIAL) and 60 assembly-coded
programs indicate increases by a factor of 1.7 [82].
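
The reasoning behind the expected factor of 4-6 is simple arithmetic, sketched below. The sketch assumes that one assembly line yields one machine instruction and that the number of checked-out lines per day is the same in either language; the figure of ten lines per day and the function names are illustrative assumptions.

```python
def machine_instructions_per_day(lines_per_day: float,
                                 instructions_per_line: float) -> float:
    """Machine instructions produced daily, given lines written and compiler expansion."""
    return lines_per_day * instructions_per_line


def expected_hol_gain(lines_per_day: float, expansion_ratio: float) -> float:
    """Ratio of HOL to assembly output when both are written at the same line rate."""
    assembly_output = machine_instructions_per_day(lines_per_day, 1.0)
    hol_output = machine_instructions_per_day(lines_per_day, expansion_ratio)
    return hol_output / assembly_output


if __name__ == "__main__":
    observed_gain = 1.7  # factor reported for the SDC data [82]
    for ratio in (4.0, 6.0):
        gain = expected_hol_gain(lines_per_day=10.0, expansion_ratio=ratio)
        print(f"expansion {ratio:.0f}: expected gain {gain:.1f}, observed {observed_gain}")
```

The expected gain equals the expansion ratio whatever line rate is assumed, so the gap between 4-6 and the observed 1.7 suggests that the equal-line-rate premise, or the comparability of the programs measured, does not fully hold.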

SOFTECH Inc. reports the use of a number of the new software
management techniques with impressive results. They include
structured programming, Chief Programmer Teams, and improved
specification methods. In 1974, SOFTECH reported programming productivity
of over 12,000 lines of debugged, documented and delivered PL/1 code
per man-year [140].

In  an  experiment  to  evaluate  the  difference  between  time-sharing 
and  batch  programming,  Sackman  gave  the  same  two  problems  (called 
Algebra  and  Maze)  to  a group  of  twelve  programmers  having  an  average 
experience  of  6.5  years.  He  concluded  that  the  extreme  difference 
between the programmers' performance on identical tasks totally
submerged any differences that might have been found due to the
different production methods [126]. Some of the differences are shown
graphically  in  Figure  2.  These  differences  did  not  correlate  with 
programmer  experience.  However,  the  highest  productivity  in  each 
task  was  achieved  using  a HOL  (JOVIAL),  and  the  lowest  productivity 
using  machine  language.  The  extremely  high  productivities  also 
highlight  the  effect  of  program  size  and  difficulty  (i.e.  small, 
easy  programs). 

Rosy [78] and Weinberg [151] identify additional factors which
influence programmer productivity. It should be clear from the
summary of observations and experiments that no standard definition
of productivity exists. The numbers presented are not comparable
because of the differences in how productivity is measured. Thus,
any assessment of trends in productivity, or the use of such a
nebulous entity for accurately estimating software development, is
generally an exercise in futility.

The major thesis of an IBM technical report by Myers [104] was
the consideration of risk relative to cost estimates for software
development. Risk is defined as the probability that the actual
costs will exceed the estimated costs, which Myers viewed as the
inverse of confidence; that is, if the risk is 25%, then one can
be 75% confident that the actual costs will not exceed the
estimated costs. Myers also presented a software development project
estimating technique. The estimate is expressed in the form of a
probability distribution function, allowing the following types
of questions to be answered:

1. What is the probability that actual costs will not
exceed x dollars (or man-months, etc.)?

2. For a risk of x percent, what is the estimated cost?

3. What is the probability that actual costs will be
between x and y dollars? [104]

Myers limited the use of the technique to projects requiring fifteen
or more people and the duration of the project to a range of
eighteen to thirty months. The technique requires estimates of
input variables such as the number of program instructions, program
difficulty, program size, and average competence of the staff.
Each variable is given three values (pessimistic, most likely, and
optimistic). Difficulty and size variables are multiplied together
to obtain a complexity index. Using this index, productivity
values (instructions/day), for either assembly language or source
language programming, are extracted from tables contained in the
automated system. Ultimately three values of the man-months
required are determined by simply dividing the estimated number
of instructions by the productivity values. To these computed
values are added estimates for "support" manpower using Pietrasanta's
[115] findings (i.e., depending on the size of the project, support
requires from one half to three times as much manpower as the direct
technical manpower needed). The complete estimate, including the
associated risk, is developed using the beta function commonly
associated with PERT for describing the time necessary to perform a
function. Albeit a method, there are a considerable number of
assumptions made which make this technique as unpredictable in terms
of estimating accuracy as most other methods. At the outset, Myers
states that it is relatively easy to estimate the expected size
(number of program instructions) of each module, if (underline
added) a modular design of the system has already been accomplished.
However, requirements analysis and system design are major portions
of the software development process and cannot be ignored in the
total project estimating endeavor. Pietrasanta [115] points out that
the validity of the estimate decreases as the list of assumptions
made in estimating grows.
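
The arithmetic of a three-valued estimate of this kind can be sketched
as follows. The sketch is illustrative only and is not Myers' actual
procedure: the instruction counts, the productivity figures, the 20
working days per man-month, and the support ratio are all hypothetical,
and the weighted mean (O + 4M + P)/6 and standard deviation (P - O)/6
are the usual PERT approximations to the beta distribution.

```python
WORKING_DAYS_PER_MAN_MONTH = 20  # assumption, not taken from the source


def direct_man_months(instructions, instructions_per_day):
    """Direct technical effort: instruction count divided by productivity."""
    return instructions / instructions_per_day / WORKING_DAYS_PER_MAN_MONTH


def pert_estimate(optimistic, most_likely, pessimistic):
    """Usual PERT approximations to the beta mean and standard deviation."""
    mean = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return mean, std_dev


def three_point_effort(sizes, productivities, support_ratio):
    """Combine (optimistic, most likely, pessimistic) sizes and productivities.

    support_ratio is the support manpower expressed as a multiple of the
    direct technical manpower (the text quotes a range of 0.5 to 3).
    """
    totals = []
    for size, rate in zip(sizes, productivities):
        direct = direct_man_months(size, rate)
        totals.append(direct * (1 + support_ratio))
    return pert_estimate(*totals)


# Hypothetical inputs: program size in instructions, productivity in
# instructions/day, each ordered optimistic, most likely, pessimistic.
sizes = (8000, 10000, 15000)
productivities = (10.0, 8.0, 5.0)

mean, std_dev = three_point_effort(sizes, productivities, support_ratio=1.0)
print(f"expected effort: {mean:.1f} man-months (std dev {std_dev:.1f})")
```

Every input to this calculation is itself an estimate, which is the
weakness noted above: the spread of the result can be no better than
the assumptions behind the three size and productivity values.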

In contrast to Myers' computation of risk, Aron [5] prefers the
following discussion of "safety factors." A safety factor is
defined as the probability that the actual costs will not exceed the
estimated costs. It represents the estimator's confidence in
understanding the problem and the likelihood that the assumptions
will be satisfied. Safety factors are converted into additional
costs and time to allow for errors in the estimate. Given an
estimate which includes a safety factor, the project manager may
take a "risk" as proposed by Myers. The risk represents the
manager's belief that the estimate can be achieved in spite of its
potential error.
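
One way to make this definition concrete is sketched below. It is not
Aron's procedure. It assumes, purely for illustration, that the actual
cost is normally distributed around an expected cost with a known
standard deviation (a simplification, since cost distributions are
generally skewed), and the figures used are hypothetical.

```python
from statistics import NormalDist


def estimate_with_safety_factor(expected_cost, std_dev, safety_factor):
    """Cost figure that the actual cost will not exceed with the given probability."""
    return NormalDist(expected_cost, std_dev).inv_cdf(safety_factor)


def safety_factor_of(estimate, expected_cost, std_dev):
    """Probability that the actual cost will not exceed the quoted estimate."""
    return NormalDist(expected_cost, std_dev).cdf(estimate)


# Hypothetical project: expected cost 100 units, standard deviation 20 units.
quoted = estimate_with_safety_factor(100.0, 20.0, safety_factor=0.90)
print(f"estimate carrying a 90% safety factor: {quoted:.1f}")
print(f"safety factor of quoting the expected cost: "
      f"{safety_factor_of(100.0, 100.0, 20.0):.2f}")
```

Under these assumptions, quoting the expected cost itself carries a
safety factor of only one half; the additional cost needed to raise the
safety factor grows with the uncertainty in the estimate.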

Considering the record for grossly underestimating software
development and the heightened awareness of the complexity of this
task, single-valued estimates for software development should be
considered inadequate for proper management. Conventional estimating
typically introduces the most likely value for each element or
activity to determine the total economic implications for the system
to be developed. Since distributions of cost are known to be
skewed upwards, a single-value estimate computed this way has very
little likelihood of occurring and may constitute a serious
understatement of the true cost. Where a single value is provided, the
mean or the modal value is generally more meaningful. In systems
development efforts, which are characterized by considerable
research and development, these latter values (mean, modal) have
been shown to exceed the conventional estimate by more than 30%

[11. 
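
The effect of upward skew on a conventional estimate can be shown with
a small worked example. The sketch assumes, for illustration only, that
each activity's cost follows a triangular distribution, whose mean is
(low + mode + high)/3; the three activities and their hour values are
hypothetical and do not come from the studies cited.

```python
def triangular_mean(low, mode, high):
    """Mean of a triangular distribution with the given bounds and mode."""
    return (low + mode + high) / 3


# Hypothetical activities: (lowest, most likely, highest) hours.
# Each is skewed upwards: the highest value lies farther from the
# most likely value than the lowest value does.
activities = [
    (80, 100, 180),
    (40, 50, 110),
    (150, 200, 400),
]

# Conventional estimate: add up the most likely values.
conventional = sum(mode for _, mode, _ in activities)

# Expected total: add up the means of the skewed distributions.
expected = sum(triangular_mean(*activity) for activity in activities)

print(f"sum of most likely values: {conventional} hours")
print(f"expected total:            {expected:.1f} hours")
print(f"understatement:            {100 * (expected / conventional - 1):.1f}%")
```

With these hypothetical figures the expected total exceeds the sum of
the most likely values by roughly 25 percent, even though no single
activity estimate is unreasonable taken alone.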

It is apparent that a conventional estimate is no better than
the quality of its inputs. An estimator's ability to identify all
cost-contributing factors, combined with the necessity to assess
each known factor's economic implication, bears directly on the
validity of the end product. It is just as difficult to appraise
the range of uncertainty which influences each cost factor.
Nevertheless, the manager must diligently search for, and determine
the magnitude of, factors which give rise to estimate variability. It
is true that the estimator's ability to describe uncertainty is
subjective. However, because of the general unavailability of
completely relevant historical data from which estimates could be
more accurately gauged, single-valued estimates also contain varying
degrees of subjectivity. It is significant that, although
estimators in the past have been aware of these uncertainties, there
did not exist a means to assess the cumulative impact on the total
estimate. Techniques for evaluating uncertainty are available and
should be used [114, 139, 44, 64]. Murphy [103] has described an
analytical method which does not improve the precision of an
estimate, but which does place the estimated cost value in perspective
with respect to decisions made using uncertain values. Although
these requirements are an added burden, the estimator will
undoubtedly derive greater satisfaction from pursuing this task than
from trying to generate a single number in the face of imperfect
information. Even the simple approaches proposed by Myers and Aron
provide some indication to management of the capriciousness of
software development estimates.

A checklist offered by McClure [106] consists of statements
to be considered prior to undertaking new software projects. For
example, one statement notes that, "Interfacing this system to the
rest of the software is trivial and can be easily worked out later."
For each statement checked on the list McClure recommends adding
10% to the estimated cost and one month to the estimated time.
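
The checklist adjustment is simple enough to state as a short function.
The sketch below assumes that each 10% is applied to the original
estimate rather than compounded, a point the description above leaves
open; the function name and the sample figures are hypothetical.

```python
def adjust_for_checklist(cost, months, checked_statements):
    """Add 10% of the original cost and one month per checked statement.

    Assumption: the 10% increments are additive on the original
    estimate, not compounded.
    """
    adjusted_cost = cost * (100 + 10 * checked_statements) / 100
    adjusted_months = months + checked_statements
    return adjusted_cost, adjusted_months


# Hypothetical project: $200,000 and 12 months, with 3 statements checked.
cost, months = adjust_for_checklist(200_000, 12, checked_statements=3)
print(f"adjusted estimate: ${cost:,.0f} over {months} months")
```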

Schwartz has neatly summarized the various maxims used in
planning and budgeting software development efforts. Some of these
are listed below. He points out that facts available to assist in
estimation are limited in preciseness but they are based on
considerable experience.

1. It is generally accepted that software costs are more
than hardware costs.

2. The dollar cost per ultimately delivered computer
instruction is one measure. Estimates which vary from
$1 to more than $50 an instruction are not unusual
with $5 for average "fair" small jobs. Larger and
more complicated jobs lead to higher potential costs.

3.  Underestimation  by  a factor  of  2.5  or  more  is  not 
unusual.  Overestimation  is  almost  unheard  of. 

4.  For  tasks  similar  to  ones  performed  previously  by  the 
same  people,  estimates  within  10%  to  30%  are  not 
uncommon. 

5. Instructions/hour/man for some large systems can be
around 3, or go as low as less than 1. For 100,000
instructions, this may mean at best 10-50 or more
man-years; for 1 million maybe at best 100-500 average
man-years or more. Estimates of much more than this
have been made for some efforts such as IBM's OS/360,
where some estimates state that 0.2 instructions/hour
were realized.

6.  All  estimates  are  functions  of  complexity,  people 
newness,  tools,  size,  hardware,  programming  language 
used,  and  other  factors. 

7. Other quantitative values for the programming
process have been given. One is that of the total
time (after Requirements Analysis) 30% is devoted
to design, 40% to programming, and 30% to system
testing. These are approximately equal to other
estimates. (Programming includes initial checkout
in this case.)

8. Management and support use half of the resources.
In other words, for every programmer assume one
other person, not a programmer, on the project.

9. During the last half or more of the project,
computer time averages around 8 hours per month per
man.

10. Averages are dangerous. To live by averages in
this case is like saying the average temperature
in a desert is 60° (0° for half the year, 120° for
the rest).

11.  The  rate  of  spending  increases  over  time. 

12. There is a frequent desire to shorten elapsed time
by various methods, such as adding more people.
But one can compress schedules only so much. Also,
paradoxically, estimates have been made that show
that the same job done in a shorter time using
more people costs more than one where more time
is used. [129]
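
The productivity figures in item 5 can be checked with simple
arithmetic. The sketch below assumes 2,000 working hours per man-year,
a figure that does not appear in the list and is used here only to make
the conversion.

```python
HOURS_PER_MAN_YEAR = 2000  # assumption, not part of Schwartz's list


def man_years(instructions, instructions_per_hour):
    """Effort in man-years at a given instructions/hour/man rate."""
    return instructions / instructions_per_hour / HOURS_PER_MAN_YEAR


# Rates quoted in item 5: about 3, about 1, and the 0.2 cited for OS/360.
for size in (100_000, 1_000_000):
    for rate in (3, 1, 0.2):
        effort = man_years(size, rate)
        print(f"{size:>9,} instructions at {rate} per hour: "
              f"{effort:,.0f} man-years")
```

Under that assumption, 100,000 instructions come to about 17 man-years
at 3 instructions per hour and 50 man-years at 1 instruction per hour,
which falls within the 10-50 man-year range quoted; at the 0.2 rate the
same system would take 250 man-years.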

Schwartz concludes that the available qualitative ideas do not
provide a precise quantitative estimating capability. There is
little quantitative help for deciding the "complexity" of a
system, the quality of the requirements, the type of programmers,
the quality and timeliness of the tools, etc. However, he suggests
that since these ideas are based on experience they counter the
natural urge to be overoptimistic and should not be ignored. While
these ideas and aids do not represent natural laws, Schwartz
believes that they are statistically accurate. The following is an
often quoted observation from the writing of Alfred Pietrasanta
regarding the qualitative and quantitative aspects of estimating
software development.

The  problem  of  resource  estimating  of  computer  program 
system  development  is  fundamentally  qualitative  rather 
than  quantitative.  We  do  not  understand  what  has  to  be 
estimated  well  enough  to  make  accurate  estimates. 

Quantitative analyses of resources will augment our
qualitative understanding of program system development,
but such analyses will never substitute for this
understanding. [116]



It is apparent that there is a wide variety of techniques, lore
and experience that are used in attempts to provide some kind of
guidelines to estimating software development. It is also interesting
to note that very little correlation has been found between estimating
ability and various levels of educational accomplishments [49].
Lulejian and Associates, under contract to the U.S. Air Force,
analyzed several software development characteristics for a
package of eighty-eight routines developed by TRW Systems Corporation.
When visually plotted, with use of regression analysis, the data
indicated (a) that larger routines cost more, and (b) that a linear fit
to the data is not very good. Furthermore, the variables of
program difficulty and programmer experience could not be shown to be
of any help in estimating [18, 63].

Literature  about  Estimating  and  Management 

In a doctoral dissertation at Syracuse University, Adams [2]
investigated the estimation process to determine what factors
influence the accuracy of estimates. His interview data and
quantitative analyses both strongly supported the five following
hypotheses:

1.  A positive  correlation  between  accuracy  and  amount  of 
information  sought  and  processed. 

2. A positive correlation between managerial talent and
amount of information sought and processed.

3. A positive correlation between the perceived importance
of accurate estimates and the amount of information
sought and processed.

4.  A positive  correlation  between  the  measure  of  managerial 
talent  and  the  perceived  importance  of  estimates. 

5.  Estimates  tend  to  become  a target  to  those  who  must 
accomplish  the  activity. 

Adams states that his primary conclusion is that the accuracy of
estimates in a project management environment is controlled by the
estimator to an extent that even he himself is not aware of.
Tsichritzis [147] identified two schools of thought for the
difficulties in managing software projects: (1) poor management,
and (2) random activity, i.e., the production of software is not a
deterministic activity since specifications are modified, personnel
change, and planning tools are lacking.

Dr. Frederick Brooks [23] has written an excellent book which
can be used as a pocket reference to the DO's and DON'T's of software
development. The book provides a compendium of the mistakes most
software managers have made along with the techniques being tried by
many. Brooks suggests that software managers are gutless estimators
and eternal optimists. There also seems to exist a fear of updating
sacrosanct plans. The emphasis in planning should not be, as
hardware engineer P. Fagg suggests, to "take no small slips," to allow
the work to be done carefully [23]. Managers should emphasize
required rescheduling, as necessary, so that everyone is not working
frantically toward an impossible goal which ruins morale, as well as
the system, and inevitably fails to meet the deadline. Brooks
reflects considerable insight into the nuances of management of
software development and the psychology of technicians and first level
managers in the reporting of problems and delays. Detailed planning
and project control are not yet as generally accepted and applied
to the development of software as they are to other "engineering"
research and development efforts. Brooks also addresses the "role
conflict" in project control systems, i.e., distinguishing between
action information and status information. Managers also attempt
to use project control systems for personnel evaluation and control.
Managers have difficulty distinguishing between project control and
people control. Using the data reported by people to manage those
same people results in unreliable input and a loss of control.
People must be managed by people, not by automated systems.
Everyone speaks of the need for better data but there are no real
commitments to collecting it.

There can be no question that Dr. Brooks knows whereof he
speaks but the book is no more than what the author claims: an
accumulation of the lore in the field and his personal views. If
software development management is frustrating then this is a book
about frustration which leaves one a little frustrated, for it offers
no new thoughts on how to do the job better.

The most recent paper on the subject of estimating was again
sponsored by research funds from the U.S. Air Force in cooperation
with the U.S. Army [137]. The report is an expensive literature
survey which concludes that (1) there is no widely accepted
methodology for estimating resource requirements for a software
development project, and (2) structured programming technology does
not introduce new techniques for estimating. The technical report also
draws the following conclusions from the survey of the literature on
software estimating.

1. Four resources are critical in the management of
software development. They are manpower, computer time,
money, and elapsed time.

2. The quantitative techniques being used to estimate
manpower resources use formulas and average
productivity tables.

3.  The  similar  experience  technique  of  comparing  new 
system  requirements  with  experience  gained  on  similar 
completed  systems  is  the  most  frequently  used  method 
for  estimating  resources. 

4.  Software  estimating  studies  which  have  sought  ways 
to  make  programmers  more  productive  have  failed  to 
first  establish  a baseline  of  productivity. 

5.  The  key  to  more  accurate  estimating  is  the 
collection,  retention,  and  availability  of  valid 
historical  data.  [137] 

This writer participated in a study by Scott [130] on the
factors affecting programmer productivity and the cost of software
development. Scott used the Delphi procedure which has three
essential features: (a) anonymous response, (b) iteration and
controlled feedback, and (c) statistical group response. These
features minimize the biasing effects of dominant individuals,
irrelevant communications and group pressure toward conformity. While
Delphi is a useful device for producing consensus, Farquhar [49]
points out that it is not a magical tool for producing a right
answer. The people surveyed were either selected because they were
currently managing programming projects, or because they were
considered experts in the management of programming projects. Panel
members were asked to correlate, on a scale from minus seven to plus
seven, the effect of increasing the magnitude for each of 35
variables on programmer productivity. One of the variables which
remained highly controversial as to its importance to programmer
productivity was "program size." The study results did clearly
demonstrate the importance that programming project managers place
on providing the working programmer with a well documented, thoroughly
defined, independent task. Hill also prescribed documentation as the
control element for software development [53]. Experienced
programmers working in high level languages also emerged as important
factors.
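
The "statistical group response" of a Delphi round can be sketched as
follows. This is not Scott's procedure or data: the panel ratings, the
use of the median and interquartile range as the group summary, and the
spread threshold for calling a variable controversial are all
assumptions made for illustration.

```python
from statistics import median, quantiles


def summarize_round(ratings):
    """Group response for one variable: median and interquartile range."""
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return {"median": median(ratings), "iqr": q3 - q1}


def controversial(panel, iqr_threshold=4):
    """Variables whose ratings remain widely spread after a round."""
    return [
        name
        for name, ratings in panel.items()
        if summarize_round(ratings)["iqr"] >= iqr_threshold
    ]


# Hypothetical anonymous ratings on the -7 to +7 scale, one per panelist.
panel = {
    "program size": [-6, -2, 0, 4, 7],
    "well documented task": [5, 6, 6, 7, 5],
}

for name, ratings in panel.items():
    print(name, summarize_round(ratings))
print("still controversial:", controversial(panel))
```

In an actual Delphi exercise the summary for each variable would be fed
back to the panel, and the ratings collected again, until the spread
stops narrowing.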

Weinberg's book, The Psychology of Computer Programming [151],
which discusses, among other things, the concept of egoless
programming, will have significant influence on the management of
software development. Briefly, egoless programming means a programmer
does not own his programs. Programs are not an extension of the
programmer's self image but merely things the programmer happened to
work on. Weinberg studied the psychological factors affecting
programmer productivity and concluded that personality is a more
important factor than intelligence in programming. That is, there
are more people unsuited temperamentally for programming than
deficient in ability to learn the job [65]. Fleischer [54] points
out that a trained person, communicating with other humans, is
rarely an obvious failure. However, a programmer communicating with
computers becomes accustomed to the experience of frequent total
failures! These experiences can lead a person, who identifies
himself with his programs, to have a very defensive state of mind. Dick
Brandon [19] has described the programmer's personality as excessively
independent to the point of mild paranoia, egocentric, slightly
neurotic, bordering on limited schizophrenia. Brandon probably said
this with tongue in cheek, for Weinberg claims that many software
houses and other programming teams have built their success on an
egoless programming organization. If Weinberg's ideas had to be
summed up in one sentence it would be that "the human element of
computer usage is important" [54].

To test the influence of assumed goals, Weinberg gave four
programmers the same problem. He directed two of them to develop a
fully checked out program that was as efficient as possible in
computer time. The other two were asked to do the job as quickly
as possible. The latter group used, on the average, only 40% of the
machine time and 33% of the effort of the other group, but these
programs ran ten times slower on the average. These observations
affect estimating only in the sense that it is important for a
manager involved in estimating to take into account the influence
of organizational objectives and individual personality.

Myers [104] observed that it is also important to separate the
terms "estimate" and "goal." Since "work expands to fit the
allotted time" (or cost) failure is assured if "goals" and
"estimates at X percent risk" are equated. Goals should be
represented by smaller values according to Myers. However, the
estimating problem of software development has not been the expansion
of work to fill allotted time, but a significant over expenditure of
resources far beyond the allotted time. Estimates must be made
as accurately, intelligently, and honestly as possible if those
involved are going to support a plan. That is not to say that
external commitment dates for the final product cannot be
established sometime beyond the estimate. In this sense Myers has
reversed the idea of estimate and goal and is certainly talking
about different goals than those which influenced productivity in
Weinberg's experiments.

Carter, Gibson and Rademacher [28] conducted a study for the
U.S. Air Force Office of Scientific Research in an attempt to
isolate the factors which determine the success or failure of a
system development effort. The study states that planning,
control, and implementation of a system development effort could be
accomplished more efficiently if management were able to quantify
each critical factor in the system undertaking. Wright stated, as
a result of his work in information systems, that there are a
limited number of reasons for system failure. As in any other type
of analysis it is important to segregate the "vital few from the
trivial many."

1.  System  too  sophisticated  and  ambitious. 

2.  Application  not  sound. 

3.  Systems  people  assumed — and  management  abdicated — 
responsibility  for  system  design. 

4.  Designed  to  supplant — not  support — the  user. 

5. Optimistic implementation.

6. Company incapable of managing with a system. [28]

The Carter study reduced the critical factors to fourteen
through mailed checklists and followup surveys. Factor clusters
were then determined, and five general classifications emerged. The
four lowest rated factors were generally unrelated and were
perceived to be of little importance (i.e., Employee Resistance, User
Attitude Toward Design Team, etc.). The other four classifications
were (1) User Involvement, (2) Management and User Attitude, (3)
Organizational Planning and Control, and (4) Systems Expertise.
According to the researchers a technique for measuring organizational
planning and control of the systems effort is under development.
Carter also reports that the research group has underway a search
for tests, scales and other instruments to measure personnel
expertise. One of the objectives of the study is to determine factors
critical to successful system development. What is "successful"
system development is the question which begs to be asked. If
Carter's group finds simple methods for quantifying the generalized
classifications of factors noted above, what will the measurement
of these classifications tell us in terms of predicting the "success"
of a development effort?

Under U.S. Air Force sponsorship in the CCIP-85 projects, Dr.
Barry Boehm, head of the Information Sciences and Mathematics
Department at the RAND Corporation, suggested several opportunities
for reducing software delays. These fall into three main categories:

1. Increasing each individual's software productivity,

2. Improving project organization and management, and

3. Initiating software development earlier in the
system development cycle [17].

The concern for improvement of programmer productivity is a
concern for the smallest factor affecting software development.
Table 3 (page 37) clearly reflects where most of the software effort
is expended. It can be seen from the table that only 15% of a
typical software effort goes into coding. Clearly, then, there is
more potential payoff in improving the efficiency of analysis and
validation efforts than in faster coding. Certainly these
development phases are more difficult to estimate accurately than is
programming. Boehm also reported that the largest source of Defense
Department software costs, in both the development and maintenance
phases, is the cost of accommodating user requirement changes.

The entire development process must be examined. The time
consuming and schedule destroying cyclic development which occurs
during the program and system testing is directly related to
incomplete problem definition and analysis. The CCIP-85 study group
recommended improving programmer productivity and initiating
software development earlier in the development cycle [17, 24]. These
accommodations are still attempts to administer to only the symptoms
(high cost, long development time) of the delay dilemma created by
poor estimating. Dr. Boehm, in his article, did not address the
underlying problems of improving total software project
organization and management (estimating).

Boehm did acknowledge, however, that the problems of software
productivity on medium or large projects are largely problems of
management: thorough organization, good contingency planning,
thoughtful establishment of measurable project milestones,
continuous monitoring on whether the milestones are properly passed, and
prompt investigation and corrective action in case they are not. In
the software management area, one of the major difficulties is the
transfer of experience from one project to the next. For example,
many of the lessons learned as far back as SAGE are often ignored
in today's software development. In 1961 Hosier published an
article on the value of milestones, test plans, precise interface
specifications, concurrent system development, and performance
analysis, etc. [17]. If it were known where most of the time is
spent during development, then subsequent research could attack
these time consumers [12]. Nevertheless, beyond these familiar
concessions to classic management theory, the CCIP-85 study group
offered the Air Force no new approaches to identifying why it doesn't
work for software development. TRW Corporation found that they could
improve their estimates by increasing their understanding of exactly
what steps and processes were involved in software development, thus
enabling better management; the better management, in turn,
improved estimates [158].

Another paper worthy of note is "The MUDD Report" [155]. This
report is the result of a year-long investigation into Navy
software problems. The report chronicles the development of a mythical
software system and describes where and how the developers went
awry. It is based on more than thirty interviews with people
responsible for development of various kinds of Navy software in ten
different organizations. The emphasis is on the difficulties which
caused schedule slippages and cost overruns. Although fictionalized,
the report is an excellent example of the kind of post mortem
analyses which should be accomplished for large software efforts that
fail to meet their deadlines and far exceed estimated costs. An
analytical vehicle is needed to promote sharing of valuable
experiences, both good and bad. Judith Clapp [30] states that the
managers who must supply data on actual expenses and failure rates
which occur during development feel this would be used to judge them.

Tom Gilb [57, 58], an EDP consultant in Norway, speaks of
an emerging technology in software metrics. He strongly believes
that all interesting properties of software are indeed measurable
enough to provide practical control over the cost and effort of
these properties in a system [3].

The book Practical Strategies for Developing Large Software
Systems [67] is a collection of papers by some of the most renowned
researchers and practitioners in managing software development. It
is a result of a one week seminar held at the University of Southern
California in 1974 on the "Modern Techniques for the Design and
Construction of Reliable Software." The intent of the seminar was to
collect experts from the industry who were intimately involved with
the software productivity issue. By emphasizing the industrial point
of view, the attendees obtained a true assessment of the extent to
which various new strategies were actually being practiced. The
editor, Ellis Horowitz, observes that the industry's inability to
effectively construct complex software systems has impeded its
ability to make good use of computer resources. Furthermore,
Horowitz emphasizes the role of the manager and programmer in system
building environments. The problems encountered in choosing the
right people, tools, schedules and procedures will, if handled
improperly, nullify the effects of the finest technology.

Canning [27] has proposed a method for managing computer
programs effectively. The management techniques used to make a
profit on fixed price software development projects in various
companies are cited [83]. At the 1975 International Conference on
Reliable Software, R. Williams also described some techniques found
to be successful in managing the development of software [156].

Almost everyone in the industry could provide a comprehensive,
valid list of the reasons why software managers cannot estimate
accurately [135, 144]. One obvious reason is that managers refuse
to accomplish the work necessary to do the job well. Managers
seem to be afraid to try to master this task. Equations are sought
which will do the job and remove the manager from any culpability.
Estimators seek anonymity by hiding behind general rules of thumb.
Methodologies used employ vague factors like "complexity" to hedge
estimates [37]. Myers has listed eight of the major reasons for
poor estimates:

1. Programming is still in its infancy and does not
approach the level of most primitive engineering
disciplines.

2. The nature of a "system" is not well understood.

3. Estimators do not understand the economics of
computer programming.

4. Estimators do not fully understand the system
development process.

5. Estimators assume that "all will go well."

6. Estimators do not connect risk and cost. Cost
is a probabilistic, not a deterministic variable.

7. There is a lack of reliable historical data on
project costs.

8. Most programmers are optimists. [104]


A New  Beginning 


Malcom Jones [71] of MIT observes that, "although many unknowns
are faced in developing new software systems, management should not
use this as an excuse for abdicating its responsibility for planning."
This is particularly inexcusable where a good control system is
available.

Judith Clapp [30] of MITRE Corporation has focused on the
common problems in software development:

1.  Reduce  the  cost  of  computer  software  development, 
acquisition,  and  maintenance,  and 

2.  Improve  the  quality  of  software  products. 

But  more  Important,  Ms.  Clapp  has  put  her  finger  on  the  approach 
needed  to  resolve  these  problems  and  the  ongoing  proliferation  and 
fragmentation  of  effort. 

The  referenced  research  is  necessary;  much  of  it  is  of 
excellent  quality.  Our  problem  and  our  objective  Is  to 
distill,  synthesize,  exploit,  and  develop  these  efforts 
Into  a coherent  body  of  engineering  knowledge  and 
methodology.  [31] 
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CHAPTER  IV 

DATA  SOURCE 


Data  Availability 

The continual lament of researchers in the area of software
planning  and  estimating  has  been  the  unavailability  of  adequate 
historical  data  [49,  110]. 

Dr. Barry Boehm has stated that one of the major problems
encountered in a large study done for the Air Force regarding
data automation implications of the 1980's [17] was the dearth
of hard data available on software efforts which would allow the
researchers to analyze the nature of software problems. Judith
Clapp, in another, more recent study done for the Air Force on
Engineering of Quality Software Systems [30], noted that the
inability of management to correctly estimate software development
can be attributed to:

1.  The  lack  of  definition  of  the  work  to  be  done, 

2.  The  lack  of  reliable  historical  cost  data,  and 

3.  The  lack  of  understanding  of  major  factors  which 
affect  time  and  cost. 

The  New  Data  Problems 

As software development project control systems become more
prevalent, more historical data are being accumulated. Data
reliability, consistency of meaning, and appropriateness of the
data to specific research are concerns which have replaced the
original problem of data availability.

Reliability will be predominately affected by the control
system and by the management atmosphere within the organization.
That is, data collection and reporting procedures in a project
control system significantly impact data reliability. How
the information (from the control system) is used by management can
enhance or destroy the reliability of the data. For example,
inaccurate reporting may result from using a project control system
for man-hour accounting as well as for projecting costs and status
reporting. False reporting to reflect a productive eight hour day is
a defense commonly used by personnel wary of such automated spies
[96].

Standardized definitions of the resource consuming activities
along with consistent units of measure should be established
throughout the industry and the government. If such standards are
not forthcoming there will continue to be a serious problem with
the consistency of meaning of the historical data among different
organizations.

Clearly,  the  data  elements,  and  their  particular  meanings,  in 
the  various  project  control  systems  may  only  be  used  to  support  a 
specified,  limited,  set  of  research  objectives.  That  is,  the  data 
must  be  appropriate  to  the  questions  examined.  Today's  project 
control  systems  collect  data  for  project  status  reporting  and 
management  information,  as  opposed  to  also  accumulating 
complementary  data  which  could  be  most  useful  in  research.  Such 
complementary  data  might  include:  (a)  reason  codes  for  small  cost 
and  schedule  changes  or  overruns,  (b)  information  regarding  changes 
to system design and their total impact on cost and schedule, (c)
recording of assumptions on which original plans were based, (d)
basis for estimates, etc. This is, however, a subject worthy of
independent research. Therefore, until some standards are
established and some way is found to minimize or distribute the costs
of  data  collection,  research  is  limited  to  the  data  which  are 
currently  available. 

DiCola observes that the basic problems in scheduling creative
resources are the lack of reliable data and the lack of understanding
of the interrelationships among elements of the data. Reliable data
require definition of the tasks that must be performed to complete
a project; time estimates for completing each task; a record of actual
man-hours expended for each task; and knowledge of the personnel
resources. With these data the actual man-hours expended on a task can
be compared with the estimates, and the accumulated data can be
analyzed [43].

The data used in this research were extracted from historical
tapes  produced  by  a software  development  project  control  system 
which  the  author  managed  and  helped  to  design  over  a period  of  more 
than  two  years.  As  noted  in  Chapter  I,  the  scope  of  this  research 
has  been  purposely  restrained  by  the  known  limitations  of  the 
data  base.  This  research  does  not  seek  to  solve  the  problem  of 
software  development  estimating  but  attempts  to  develop  better 
understanding of that process. Thus the data which have been
selected are appropriate to the objectives of this research. By
limiting observations to only one set of data from one project
control system within a single organization (which has a standardized
management environment) the data across various projects will be as
consistent as is possible. Finally, regarding the question of data
reliability, some compromise must be made. The data were extracted
from a project control system which was not a man-hour accounting
system. Nevertheless, a fear of the control system persisted
throughout the organization which influenced the integrity of
reporting. Any system which relies on the integrity of individuals
reporting on their own productivity will always suffer to some
degree from biased, inaccurate data. Equally important to any
evaluation of the results of this research is some knowledge about
the organization which developed these software systems and the
internal project control system which collected and processed the
data.

Organizational  Environment 

The Air Force Data Systems Design Center (AFDSDC), at Gunter
Air Force Station, Montgomery, Alabama, is a unique organization
which is responsible for the total development (Feasibility through
Implementation) and continual maintenance of a variety of standard
automated management systems used by Air Force installations
throughout the world (Figure 3; standard systems are those used at two
or more bases). These systems assist the Air Force in managing civilian
personnel, aircraft maintenance, hospitals, supply, manpower, etc.;
almost every management function in the military service which lends
itself to the efficiency and accuracy of data automation. One
effect of this variety of functions is that approximately 700
software projects of varying magnitudes are ongoing most of the time.
These projects range from a few short weeks for correcting program
errors to projects which may involve more than a hundred people
over several years.

The Design Center is organized into several functional
directorates such as Logistics, Finance and Accounting, and Medical, and
supporting data automation directorates, such as Operations and
Data Processing Systems Management (Figure 3). The latter
organization includes functions such as the development of standards
manuals for Air Force automation and a small simulation laboratory.
The Operations Directorate runs a closed shop program/system test
laboratory. This directorate possesses standard configurations of
hardware systems used throughout the Air Force, such as the
Burroughs 3500 computer, the Univac 1050 II, the Honeywell 800/200
system and 6500 computer. This organization is also responsible
for all software and documentation reproduction and distribution.
In addition, the Operations Directorate maintains a unique 24 hour,
seven-day-a-week trouble desk which receives calls from around the
world regarding software malfunctions.

The Design Center also includes small staffs from the Office of
the Auditor General and Air Force Communications. These
organizations represent their specialized interests with regard to the
design of new management systems to be implemented worldwide.
Directing the entire professional organization is a commander with
his essential staff elements for budget, personnel matters, etc.

In the functional directorates, functional system analysts
work side by side with data automation analysts and programmers.
More than one thousand personnel are assigned to the Center.
Personnel in the Design Center are an approximately 60/40 mix of
civilians and military, respectively. Depending on a director's
management philosophy, the automators and functional analysts might have
different supervisors of their own Air Force specialty
classification. In contrast, the three types of skills (functional analyst,
data system analyst and programmer) could be placed under one
supervisor (usually functional) using a project team concept. The
supporting data automation directorates are predominately staffed by
experienced data automation personnel.

New systems requirements and major modifications to existing
systems are assigned by Headquarters U.S. Air Force. The
requirements are then staffed by a team of professionals (i.e., the primary
functional directorate, interfacing directorates, the Auditor,
Communications, Operations, etc.) at the Design Center. This team
determines gross requirements for computer support and manpower,
and makes a rough estimate of start and completion dates according
to the priority of the task. How the proposed system interfaces
with other system development efforts and existing systems is
also examined.

Project  Planning 

The approved software development project is then planned in
varying degrees of detail. Factors influencing the detail plan
are (a) the project manager's understanding of what is required,
(b) his experience and management approach, and (c) the specific
phase of development being planned. The outline for this standard
plan is provided by the Design Center's automated project control
system called the Planning and Resource Management Information
System (PARMIS) [148]. The Design Center found it imperative to
implement some standardized planning and project status system with
over seven hundred software development projects ongoing at any one
time. Only about sixty of these projects involved more than two
man-years of effort (@ 2000 hrs./man-year) but consumed 80% of the
available professional resources.

PARMIS was implemented in 1968 as a rather simple but
standardized planning and reporting tool. The system was little more than
an inventory of ongoing or planned software development projects at
its inception. By 1972-1974 (the two year historical data base
selected for examination by this research) PARMIS had become well
entrenched, and was more sophisticated in terms of planning, tracking
project status and highlighting schedule problems. Continuing
limitations of the system included incomplete initial planning, a
reluctance to update plans, and reporting integrity. Furthermore,
the system has never included cost calculations based on the
man-hour data available. Complete reporting of all resource expenditures
(i.e., implementation, support hours, travel, etc.) on a project is
an unrealized objective.

The two year period from 1972-1974 was selected because it
contained the planning history of the most completed projects.
Furthermore, a major modification to PARMIS prevented original estimates
and schedules from being removed. That is, plans and schedules could
be changed, such that project status was determined by progress
compared to the latest plan, but original estimates and schedules
were retained. Thus, this research reflects an accurate
comparison between original estimates and the reported expended hours for
project totals, and for each detailed, resource consuming activity
which had been planned in each software development project. Nelson
emphasizes that if resources can be measured as they are expended,
a more accurate cost history of computer programming projects can
be compiled for use in future research [110].
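
The comparison between original estimates and reported expended
hours reduces to two elementary quantities for each activity: the
arithmetic difference and the percent difference. A minimal Python
sketch follows. The function name, the sign convention (expended
minus estimated) and the sample figures are illustrative assumptions,
not PARMIS definitions.

```python
def estimate_differences(estimated_hours, expended_hours):
    """Return (arithmetic difference, percent difference) for one activity.

    Assumed sign convention: positive values mean an overrun, i.e. more
    hours were expended than were originally estimated.
    """
    if estimated_hours <= 0:
        raise ValueError("an original estimate greater than zero is required")
    difference = expended_hours - estimated_hours
    percent = 100.0 * difference / estimated_hours
    return difference, percent


# Hypothetical activity: 40 hours estimated, 50 hours expended.
diff, pct = estimate_differences(40, 50)
print(diff, pct)  # 10 25.0
```

Dividing by the original estimate, rather than by the expended hours,
is one possible choice; it expresses the error relative to what the
planner committed to.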

Project  Size 

The projects selected for this research do not represent large
system efforts such as the Apollo System, OS/360, or the SABRE
system. Some large system developments do occur at AFDSDC, but only
three of any size (> 20,000 hours) are included in this sample. The
sample primarily contains small to medium size projects, although
many do satisfy some of the criteria for large projects listed by
Aron [5], Dennis [41] and McCubbin [46]. Tsichritzis [147] believes
that there is a tendency to eliminate "large" software projects,
citing the fact that only three experienced persons were used to
produce the initial software for the CDC Star computer. A group of
three is quite a departure from the 5,000 people used for the
development of OS/360. It is unlikely that these facts can be fairly
compared, but the difference is so dramatic the comparison is worthy
of some attention. The small, highly professional staff used by IBM in
their "Chief Programmer Team" concept while developing a large, complex
system for The New York Times is further evidence of this tendency
[100].

Control  System  Features 

Malcom Jones of MIT states that perhaps the hardest part of
managing software projects is designing the control system. The
system is needed in order to verify whether or not the plan is
working and whether the project is on schedule. Without a control
system it is difficult to detect early signs of trouble so that
schedule or specification slippages can be identified and immediately
corrected. "Early warning systems," although highly desirable, are
not easy to devise for software projects [71, 98].

The Design Center and the PARMIS system possess several unique
features which enhance the data and make them more interesting:

1. The variety of functional management systems developed.
Some simulation, utility software, and operating system
programming is also accomplished.

2. Various sizes of projects.

3. A common, all encompassing, organizational structure
provides continuity to the overall management system.

4. There is only a single control system for all users.
Standard activity definitions for project planning are
required for all software development projects.

5. Mandatory estimating and planning are required for all
software development projects.

6. PARMIS is not a man-hour accounting system.

7. Training in the objectives and use of the system is
mandatory for all Design Center personnel.

8. A "standard" high order language, COBOL, is predominant
for functional software development efforts.

9. Data are collected as the effort is expended.

10. Data are easily accessible.
The limitations of the data caused by the organizational environment
and PARMIS itself include:

1. Data reliability problems caused by:

(a) The management environment and misuse of the data
for purposes other than those for which it was intended
(i.e., manpower authorization reductions), which created
a dislike and distrust of the system.

(b) Reporting integrity and discipline (accuracy and
consistency), which are strongly influenced by (a).

(c) The limited and varied experience, planning and
estimating ability of project managers.

2. Limited data caused by a failure to plan the early
phases of development in detail. Generalized
activities are sometimes used to accumulate work effort and
disguise project progress.

3. Non-standard activities are sometimes included in plans
to satisfy different management approaches. While this
flexibility is a planning advantage within PARMIS
itself, the inconsistency is a disadvantage in research.

4. The data are limited to one organization. The general
advantage of management and organizational consistency
is countered by the limitation of not being able to
compare a cross section of the industry.

5. Planning complexity within PARMIS was limited during
the two years selected. The ability to create
extensive planning dependencies through PERT type
networking techniques was not available.

PARMIS 

PARMIS is similar to most other project control systems and was
designed to minimize the effort required in software development
planning and reporting. A new project is initiated by establishing
a new record in the system using only thirteen basic administrative
data elements (i.e., project manager's name, organization symbol and
routing indicator, project number, etc.). These initial data
elements must also identify which of the eight major development
phases will be necessary for the particular project along with a
project complexity code. A project may involve any one or all of
the phases. A problem in an existing system might only necessitate
planning a few programming activities. A manager could also
specify a particular development phase more than once, which could
allow him to devise a separate plan for every program to be developed
or modified within the project. The flexibility of the system allows
various levels of planning detail to be accomplished for projects
of different sizes. For example, a program error which might only
require a few hours or days to correct does not require extensive
detailed planning. A small new project could also be established
and planned in a few hours on a gross activity basis (i.e., Analysis,
Programming, Documentation, Testing, etc.).

Once the project is established in the file, PARMIS outputs a
Planning Worksheet for each development phase requested that
contains all of the standard resource consuming activities which might
be associated with the phase. The worksheet also provides a guide
for estimated hours for each activity extracted from a table within
the system, based on the gross project complexity code provided in
the initial input. Estimated man-hours for each activity, provided
by the individual responsible for planning that phase or activity,
always override the man-hour guide provided by the system. The
concept of "complexity" is very abstract and poorly defined. The
system intends to provide only a rough starting point based on the
project manager's intuitive feeling (complexity) for the effort
involved. These estimating guidelines based on complexity fall
short of being considered an accurate planning tool or aid. In
March, 1974, the Design Center Operations Research Division
conducted an analysis of the predictive capability of recorded
experience. The analysis used the "complexity" code (there are five
levels defined) assigned by the planner to each activity as the key
input parameter. One finding was that the range of hours recorded
in PARMIS history, at each level of complexity, was too broad to
be meaningful for predictive purposes [124, 142]. On the
Planning Worksheet, the planner handscribes estimates of the time
required to do each activity selected. The time is estimated by
four categories of resource, namely, functional analyst, data
system analyst, programmer and support. Span days for each activity
are also estimated. There is not necessarily a relationship between
man-hours estimated and the span days required, since most
personnel worked on more than one task. If available, other data
elements, such as the name of the individual to do the task, are
also included on the worksheet. Standard and non-standard activities
can be added and deleted, giving the planner total control of his
plan.

The completed plan is input to the system, so that start and
completion dates may be derived for each activity within a phase.
This computation is based on the start dates provided, a ten year
internal calendar, and the simple network relationships established
among the activities. In the period 1971-74, network planning of
activities could not be accomplished across development phases.
Dependent and independent relationships among activities could only
be identified within a phase. The plan is processed between the
system and the planner as often as changes continue to be made.
When the planner specifies that the initial planning is completed
the original plan is firmly established as a project.
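
The derivation of start and completion dates from start dates, a
calendar and within-phase dependencies can be pictured as a forward
pass over the activities of one phase. The sketch below is only an
illustration under stated assumptions: a Monday-to-Friday working
calendar without holidays, finish-to-start dependencies, and
hypothetical activity IDs and function names. The source does not
describe the actual PARMIS computation in this detail.

```python
from datetime import date, timedelta


def add_workdays(start, span_days):
    """Return the completion date of an activity lasting span_days.

    Assumption: Monday-Friday are working days, holidays are ignored,
    and the start day counts as the first span day.
    """
    current = start
    remaining = span_days - 1
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current


def next_workday(day):
    """Return the first working day strictly after the given date."""
    day += timedelta(days=1)
    while day.weekday() >= 5:
        day += timedelta(days=1)
    return day


def schedule_phase(phase_start, activities):
    """Forward pass over the activities of a single phase.

    activities maps an activity ID to (span_days, [predecessor IDs]);
    each predecessor must appear earlier in the mapping.
    """
    schedule = {}
    for activity_id, (span_days, predecessors) in activities.items():
        start = phase_start
        for pred in predecessors:
            # A dependent activity starts after its predecessor completes.
            start = max(start, next_workday(schedule[pred][1]))
        schedule[activity_id] = (start, add_workdays(start, span_days))
    return schedule


# Hypothetical phase starting Monday, 7 January 1974.
plan = schedule_phase(date(1974, 1, 7), {
    "A10": (5, []),       # independent activity, five span days
    "A20": (3, ["A10"]),  # dependent on A10, three span days
})
print(plan["A20"])
```

Because dependencies are resolved only among the activities passed
in, the sketch mirrors the stated restriction that relationships
could be identified within a phase but not across phases.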

Subsequent updated plans and estimates can be submitted as often
as necessary, but the original plan remains unchanged after it is
accepted as a project. These original planning data are retained for
future research, and as a baseline for tracking project progress.
Furthermore, this approach should assure a conscientious first effort
at planning, and provide planners with an historical look at
estimating tendencies.

Data collection and reporting become the primary considerations
after the project is started. PARMIS also outputs a unique planning
tool which is most beneficial to the individual workers. Every week,
each individual scheduled to start to work on, or to complete, an
activity in any project receives an Individual Work Schedule List
(IWSL). This form lists previously planned work for each individual
to be started or completed over the next two week period. The IWSL
is also the document on which daily annotations of expended effort,
by activity, are made. Each week, after the new IWSL is received,
the old form is turned in to the PARMIS group for keypunching and
entry into the system.

Various management reports trace the status of the projects based
on the hours expended by each skill category as compared to hours
estimated. Activities completed compared to their forecasted
deadline dates are another status indicator. Problem activities and
phases are automatically identified.

Another unique output available from PARMIS is an eleven month
forecast (limited by the width of the paper, but any eleven months
could be selected) of the resources planned to be expended by
project, by skill, in any single month. This product can be sorted in
a multitude of ways. It can, for example, be used by an
organizational entity to determine how their resources are already committed for
each month when planning some new work. By sorting this product by
name it shows how an individual is committed across all of the
projects he is assigned to. This report frequently helps workers with
supervisors who have overcommitted available man-hours in any given
month. Clearly, in these cases, some revised planning is essential
if all of the work is to be completed on time.
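
Sorting the forecast by name amounts to summing planned hours per
individual per month across projects and comparing the result with
the hours available. The following Python sketch illustrates the
idea; the record layout, the function names and the 160-hour monthly
capacity are assumptions made for the example and are not taken from
PARMIS.

```python
from collections import defaultdict


def monthly_commitments(planned):
    """Sum planned hours by (name, month) across all projects.

    planned is an iterable of (project, name, skill, month, hours) tuples.
    """
    totals = defaultdict(float)
    for _project, name, _skill, month, hours in planned:
        totals[(name, month)] += hours
    return dict(totals)


def overcommitted(totals, available_hours=160.0):
    """Return the (name, month) keys whose planned hours exceed capacity.

    available_hours is an assumed monthly capacity, not a PARMIS value.
    """
    return sorted(key for key, hours in totals.items()
                  if hours > available_hours)


# Hypothetical forecast records.
records = [
    ("P001", "Smith", "programmer", "1974-03", 120),
    ("P002", "Smith", "programmer", "1974-03", 80),
    ("P001", "Jones", "data system analyst", "1974-03", 100),
]
totals = monthly_commitments(records)
print(overcommitted(totals))  # [('Smith', '1974-03')]
```

Grouping the same records by project or by skill instead of by name
would give the other sort orders the report supports.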

Since PARMIS is not a man-hour accounting system, there is no
delay in updating the file with reported progress. The system does
not have to ensure that everyone reported or that they had reported
forty hours. Reporting discipline (reporting every week, on time
and accurately) is the responsibility of the individual and the
project manager. Data not input on time failed to be included in
standard summary reports for a specific reporting period.

A telephone service is also provided to PARMIS users for
follow-up on system identified delinquent items, for reporting hours, for
identifying completed events prior to the IWSL being submitted, and
for updating plans and estimates, etc. Each time a plan is changed,
the project manager receives a notice of the change and its impact
on the schedule. Presently, PARMIS is experimenting with an
on-line capability which allows anyone to update the plan directly or
to determine immediately the present status of any part of a
software development project plan. PARMIS also uses a generalized
inquiry routine for special requests for unique management summaries
not issued on a periodic basis.

The PARMIS group consists of twelve personnel. Three programmers
are used for software maintenance. There are four operations
personnel, including one keypunch operator. These personnel assist project
managers with planning by helping them to take advantage of unique
capabilities. The Operations section processes all incoming
[illegible], operates the telephone followup service, and
[illegible] the computer center for processing. A three man
[illegible] prepares periodic briefings and special
[illegible] to the attention of
project managers. A division chief and secretary complete the
organization's authorized manpower. Approximately twenty hours of
computer time were used each month (prior to the on-line capacity)
for normal processing, special inquiries, and testing of system
changes. The Design Center considers PARMIS both an operational
necessity and an experimental laboratory. The Center has long
range commitments to find a better way of planning, estimating and
managing software development.

Data  Selection 

The  data  used  for  this  research  were  selected  from  the  PARMIS 
history  tapes  of  projects  completed  during  the  years  1972-1974.  A 
week  of  research  at  the  Air  Force  Data  Systems  Design  Center  in 
Montgomery,  Alabama,  obtained  several  different  output  products  from 
the  PARMIS  history  tapes  using  a generalized  inquiry  routine: 

1.  The  first  report  listed  all  projects.  It  reflected  which 
development  phases  had  been  planned  and  the  number  of 
activities  which  had  been  worked  on  in  that  phase.  The 
report  also  provided  phase  and  project  totals  for  the 
hours  expended  by  each  of  the  four  skills,  their  sum, 
and  the  sum  of  hours  estimated.  From  this  listing, 
candidate  projects  were  classified  as  "good"  or  "possible." 
The  twenty  one  projects  identified  as  "good,"  containing 
322b  activity  records,  were  all  used  in  the  research 
because  they  reflected  relatively  complete  development 
projects.  That  is,  each  project  usually  spanned  the 
gamut  of  development  phases  and  each  phase  contained  a 
representative  sample  of  activities.  Only  eighteen 
of  the  twenty  one  "possible"  candidates  were  ultimately 
used,  representing  850  activity  records.  These  are 
projects with incomplete planning or some other
deficient planning characteristic. However, it must be
remembered that the primary objective of the research
was to examine estimating accuracy as it related to
specific  activities.  An  unexpected  result  would  be 
obtained  from  the  research  if  some  correlation  could 
be  identified  between  activity  estimates  and  total 
project  estimates.  More  than  210  completed  projects, 
available  in  the  data  base,  were  rejected  for  various 
reasons  including: 

(a)  The  project  represented  special  studies.  For 
example,  only  the  Feasibility  phase  was  included. 

(b) The projects only involved some documentation
changes.

(c) The project planning was incomplete (i.e.,
involved programming only). No Analysis,
Documentation, Testing or Implementation phases
were included.

(d) Too few activities had been planned within the
development phases.

(e) Projects were also rejected where it was apparent
project managers had manipulated the PARMIS
system by assigning different project numbers
to each different development phase.

Elimination  of  a project  did  not  necessarily  exclude  the 
activities  within  the  project  as  candidates  for  the  data 
base  population. 

2. The next product derived from the data base was a listing
of the detailed activities by assigned identification
(ID) number which included the definition of the kind of
work being accomplished in the activity. These activities
were grouped by development phase within a project. The
listing reflected the hours expended by each of the four
skills, the total expended, the total estimated, and
the numeric difference. It was decided not to include
activities where an estimate was made but where zero
hours were expended. It is believed that including
these errors "of planning judgement" would
unnecessarily bias the examination of estimation accuracy
of activities which actually occurred (consumed
resources). A minor problem made itself apparent in
this  listing.  In  numerous  cases  the  planner  had  not 
used  the  standard  activity  ID  number  which  rendered 
useless  the  special  magnetic  tape  of  the  two  year 
historical  data  provided  by  the  AFDSDC.  Fortunately, 
in  most  cases  of  this  type  the  standard  activity 
definition  had  been  used  and  it  was  easy,  though 
manually  time  consuming,  to  convert  the  ID  number 
back  to  its  standard.  There  are  numerous  reasons 
why  planners  changed  standard  ID  numbers.  However, 
a discussion  of  the  rationale  for  such  action  is  not 
pertinent  to  this  study.  Where  an  activity  with  a 
non-standard  ID  and  a non-standard  definition  could 
not  be  clearly  related  to  a standard  activity,  it 
was  discarded.  Project  estimated  and  expended 
totals,  however,  were  not  changed  as  a result  of 
these  kinds  of  inconsistencies  in  the  data. 

Another decision made, as a result of close examination
of this product, was to eliminate the Implementation
activities from consideration. Attempts
to report and accumulate expended hours for Implementation
activities, which involved the expenditure
of effort from many different organizational entities,
were very plebeian at that stage of the development
of PARMIS. Therefore the activities from the
Implementation Phase are considered too unreliable for
use in this research. Again, actual project totals
were left intact.

3.  The final product obtained was a listing of all activities
grouped by activity ID number. This listing also
included the activity definition of the work being
performed and all summarized hours. The 1200 independent
activities (unrelated to specific projects
previously selected), which completed the population of
records (approximately 5250) to be examined in the research, were
selected from this listing. The population total represents
approximately 59% of the activity records
available in the original two year data base.

Data  Elements 

While the PARMIS data base contains numerous data elements for
project and resource control, those which will be of primary concern
to this research are as follows:

Project Number: Correlates all activities under one accounting
number and identifies the type of application (Finance,
Personnel, etc.).

Event Group: Identifies the phase of development (Analysis,
Programming, Testing, etc.).

Event Number: Identifies the activity within the phase (coding,
typing documentation draft, etc.). The primary element to be
analyzed in terms of estimating accuracy.

Event Title: A standard definition of the work being accomplished.

Workload Category: A type of priority scheme. Category A is used
for programming maintenance projects (error correction) which
will not be considered in this research. This is the highest
priority of work at the Design Center. Category B and C include
new development projects and modifications to existing
functional systems, utilities and operating system programs.
It is from these two priorities that the projects and
activities will be selected.

Program Action Code: This code further classifies work under the
Workload Category. This research sample will be limited to new
development projects (code N).

Estimated Hours: The original estimate of man hours required to
complete an activity, subdivided by Functional Analyst, Data
Systems Analyst, Programmer, Support and Total.

New Estimated Hours: The latest estimate of time to complete the
event, subdivided as above.

Expended Hours: The time required to complete the event,
subdivided as above [148].
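
As an illustration only, the selection rules stated in the element definitions above could be applied to a set of records roughly as follows. The record layout and every name in it (`ActivityRecord`, `workload_category`, `program_action_code`, and so on) are hypothetical stand-ins for the PARMIS fields, not the actual file format. The rule that drops activities with zero expended hours is taken from the data source discussion earlier in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityRecord:
    """Hypothetical, simplified view of one PARMIS activity record."""
    project_number: str
    event_group: str          # development phase code, e.g. "D"
    event_number: str         # activity within the phase, e.g. "020"
    workload_category: str    # "A", "B" or "C"
    program_action_code: str  # "N" marks new development
    estimated_hours: float
    expended_hours: float


def is_candidate(rec: ActivityRecord) -> bool:
    """Apply the selection rules described in the text."""
    return (
        rec.workload_category in {"B", "C"}   # Category A (maintenance) is excluded
        and rec.program_action_code == "N"    # new development only
        and rec.expended_hours > 0            # estimated but never worked: dropped
    )


def select_candidates(records):
    """Return only the records that satisfy every selection rule."""
    return [r for r in records if is_candidate(r)]
```

The sketch covers only the rules that can be expressed per record; the project-level rejections listed earlier (special studies, incomplete planning, and so on) required manual review in the original study.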

Throughout most of the research an additional 43 "small" projects
were uniquely identified. The "small" projects consisted of
375 individual activity records. These are projects classified
under an X35.2 Activity Group in Appendix 1. The activities in
this group (Item Numbers 67-72) use only gross definitions for
planning (i.e., K100 Feasibility, K200 Analysis, K300 Programming,
K400 Test and K500 Documentation). It was decided that these gross
activity definitions were incompatible with the standard project
activities and should not be included in this research.

CHAPTER V

PROCEDURE 


The  Elementary  Variables 

There is no dearth of recommended techniques and experiences for
attacking the symptoms of the estimation problem (software development
cost and time overruns). These techniques are of value, but do
not replace basic research where experience and rules of thumb are
supported by measured results. Dr. Boehm concludes that until a
firm data base is established, the phrase "Software Engineering"
will be a dichotomy of terms and the software component of what is
now called computer science will remain far from Lord Kelvin's standard:

When you can measure what you are speaking about, and
express it in numbers, you know something about it:
but when you cannot measure it, when you cannot express
it in numbers, your knowledge is of a meager and
unsatisfactory kind: it may be the beginning of knowledge
but you have scarcely in your thoughts, advanced
to the stage of science. [13]

The elementary procedure of this research has been to measure
the simple arithmetic and percent differences ($d_i$) between the
estimated ($E_i$) and observed expended ($O_i$) hours of standard resource
consuming activities ($i$), accomplished during software development.
It is suggested that the arithmetic difference provides a variable
which represents magnitude which influences project duration and
cost. Percent difference is an indicator of estimating accuracy.

$d_i = E_i - O_i$, a measure of magnitude

$d_i = (E_i - O_i)/O_i$, a measure of accuracy
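
The two elementary variables can be written out as a short sketch. The function names are illustrative, and the percent difference is returned as a fraction rather than multiplied by 100, which is an assumption about presentation only. With this sign convention a negative value marks an underestimate, since more hours were expended than estimated.

```python
def arithmetic_difference(estimated: float, expended: float) -> float:
    """d_i = E_i - O_i; a negative value is an underestimate."""
    return estimated - expended


def percent_difference(estimated: float, expended: float) -> float:
    """d_i = (E_i - O_i) / O_i, returned as a fraction (0.25 means 25%)."""
    if expended == 0:
        # Activities with zero expended hours were excluded from the study,
        # so the ratio is treated as undefined here.
        raise ValueError("percent difference is undefined for zero expended hours")
    return (estimated - expended) / expended
```

For example, an activity estimated at 100 hours that consumed 80 hours has an arithmetic difference of 20 hours and a percent difference of 0.25.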

Data  Elements 

Subsequent to the process described in Chapter IV, Data Source,
for selecting the data base of projects, the data were keypunched
into a record with the following fields:

Phase Code: A single alphabetic code to identify development
phase. These codes are the same as used for the six (6) phases
in PARMIS. For example:

B - Feasibility study
D - Programming

Activity Number: Three position numeric identifier which is
unique when associated with a specific phase code. Same code
as used in PARMIS. Sixty (60) unique activities were used in
this study. For example:

D020 - coding

Functional Analyst Hours, Data Systems Analyst Hours, Programmer
Hours, Support Hours: Number of hours expended on this
activity. Identified by skill designation.

Estimated Hours: Total hours estimated for the activity. Not
identified by skill.

Project Number Code: Three position numeric identifier to
distinguish among projects and independent activities as
follows:

000 - Independent activities
001-021 - "good" projects
501-518 - "possible" projects

Computer  Programs 


The  first  computer  program  accepted  the  above  data,  edited,  and 
summarized  each  record  both  as  an  individual  activity  record  in  the 
total  population  (including  independent  records)  and  as  an  activity 
within  a specified  project. 

The following summaries and computations were made for each
unique activity (60 each) and development phase (6 each) for both the

1.  population (all activity records), and

2.  separate project totals (activity records within a specific
project):

(a)  Total expended hours by skill,

(b)  Total expended hours - sum of the four skills,

(c)  Total number of occurrences,

(d)  Number of times underestimated,

(e)  Number of times overestimated,

(f)  Number of times estimated and expended hours exactly
equal,

(g)  Total number of hours underestimated,

(h)  Total number of hours overestimated,

(i)  Absolute sum of the arithmetic differences

(j)  Sum of the percent differences

$\sum_{i=1}^{N} d_i = \sum_{i=1}^{N} (E_i - O_i)/O_i$

(k)  Mean of the arithmetic (and percent) differences

$\bar{d} = \frac{1}{N} \sum_{i=1}^{N} |E_i - O_i|$  and  $\bar{d} = \frac{1}{N} \sum_{i=1}^{N} (E_i - O_i)/O_i$,

(l)  Standard deviation of the arithmetic (and percent)
differences, SD.


In addition to the summary listing, the above routine also provided
punched output for each unique activity in each project, for
subsequent processing. This output included:

1.  Phase code and activity number,

2.  The arithmetic difference of each activity record in a
project,

3.  The percent difference of each activity record in a
project, and

4.  Project number.
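
The per-activity summaries produced by the first program, items (c) through (l), can be sketched as below. This is not the original routine: the function name and dictionary keys are illustrative, the skill totals of items (a) and (b) are omitted, and two points are assumptions because the source formulas did not survive intact. The mean percent difference is taken over signed values, and the standard deviation is computed as a population standard deviation of the signed differences.

```python
from statistics import pstdev


def summarize(records):
    """Summarize one activity.

    records: list of (estimated_hours, expended_hours) pairs, expended > 0.
    """
    arith = [e - o for e, o in records]
    pct = [(e - o) / o for e, o in records]
    n = len(records)
    abs_sum = sum(abs(d) for d in arith)
    return {
        "occurrences": n,                                          # (c)
        "times_underestimated": sum(d < 0 for d in arith),         # (d)
        "times_overestimated": sum(d > 0 for d in arith),          # (e)
        "times_exact": sum(d == 0 for d in arith),                 # (f)
        "hours_underestimated": sum(-d for d in arith if d < 0),   # (g)
        "hours_overestimated": sum(d for d in arith if d > 0),     # (h)
        "abs_sum_arith": abs_sum,                                  # (i)
        "sum_pct": sum(pct),                                       # (j)
        "mean_arith": abs_sum / n,                                 # (k), absolute values
        "mean_pct": sum(pct) / n,                                  # (k), signed: assumption
        "sd_arith": pstdev(arith),                                 # (l), population SD: assumption
        "sd_pct": pstdev(pct),
    }
```

Calling `summarize([(10, 8), (10, 20), (5, 5)])` yields one overestimate, one underestimate and one exact estimate, with 2 hours overestimated and 10 hours underestimated.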

The  purpose  of  a second  routine  was  to  "score"  (0  or  1)  each 
unique  activity,  within  each  project,  for  subsequent  use  in  SEQUIN 
(Sequential  Item  Analysis  Routine)  [118],  based  on  whether  specified 
criteria  were  satisfied  (1)  or  not  (0).  Two  criteria  were  used. 

One criterion (x) was used to determine if the difference (d, arithmetic
or percent), of a specific activity (i), fell within some
percentage range of the standard deviation (SD) computed for that
activity. Several test values were selected for this criterion
(i.e., x = 0.1, 0.25, 1.0, 1.25, 1.5 and 2.0). The relationship between
the variable pairs (difference, $d_i$, and standard deviation,
$SD_i$) is as follows:

(1)  $d_i < x \cdot SD_i$

The relationship was applied to the arithmetic difference and corresponding
standard deviation pair for each activity in each project.
The percent difference and its corresponding standard deviation also
formed a variable pair against which the relationship was applied.

The  second  criterion  was  based  on  whether  some  percentage  (50%  or 
75%)  of  the  unique  (same  phase  and  activity  code)  activities,  within 
a specific  project,  satisfied  the  criterion  in  expression  (1).  This 
latter  criterion  was  specified  because  most  projects  had  multiple 
unique  activities,  while  SEQUIN  allows  only  one  "score"  (0  or  1)  per 
"score  card"  for  each  unique  activity  or  phase.  For  example, 
activity  D020,  Coding,  would  be  individually  estimated  for  every 
separate  program  in  the  project.  Analysis  activities  might  have 
multiple  occurrences  where  different  personnel  worked  on  separate 
parts  of  the  activity.  Furthermore,  this  criterion  permitted  some 
reasonable  flexibility  in  examining  estimating  accuracy. 

Twenty-four (24) individual "score cards" were output for each
project for subsequent input to SEQUIN. This was a result of using
the six (6) variable modifiers (x) of the standard deviation in
conjunction with both the arithmetic and percent differences, in
addition to the requirement that only 50% or 75% of the unique activity
differences satisfy expression (1). A one (1) or a zero (0)
was placed in each of 72 columns (one for each of the 60 activities
and 6 phases) in each "score card" based on whether the two conditions
or criteria (expression (1), and some percentage of activities satisfying
that expression) were satisfied. Recall from page 75 that
activities 67-72 were excluded from the research.
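
The count of twenty-four follows from crossing the three choices just described: six multipliers, two kinds of difference, and two required fractions. A minimal sketch, with illustrative constant names:

```python
from itertools import product

MULTIPLIERS = (0.1, 0.25, 1.0, 1.25, 1.5, 2.0)   # values of x listed in the text
DIFFERENCE_KINDS = ("arithmetic", "percent")
REQUIRED_FRACTIONS = (0.50, 0.75)

# One (x, kind, fraction) triple per "score card" produced for a project.
SCORE_CARD_SETTINGS = list(
    product(MULTIPLIERS, DIFFERENCE_KINDS, REQUIRED_FRACTIONS)
)
```

Each triple in `SCORE_CARD_SETTINGS` corresponds to one score card per project, and therefore to one SEQUIN run over all projects.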

In particular, a "score" (0 or 1) was obtained by comparing the
arithmetic difference ($d_i$), for each activity in a project to some
percentage (criterion x) of the corresponding standard deviation (SD)
computed for that unique activity from the total population. Each
time a unique activity, within a specific project, satisfied the
criterion in expression (1) a counter (k) was increased by one. When
all project activities were processed the value for each counter (k)
was then divided by the total number of occurrences of each unique
activity (M) within the project. If the quotient was equal to or
greater than 50% then the appropriate activity column for one of
the "score cards" for that project received a one (1). Another "score
card," for each project, was output with appropriate "scores" based on
whether the result of a counter divided by the total number of occurrences
of a unique activity equaled or exceeded 75%. That is,

when  $d_{i,k} < x \cdot SD_i$

where  $i$ = a unique activity, $i = 1(1)N$,

$k$ = a dummy index to indicate the number of
multiple occurrences of a unique activity
within a project, $0 \le k \le M$,

$M$ = the number of multiple occurrences of a
unique activity in a project,

$N$ = the number of unique activities in a
project, $N \le 60$,

then  score = 1

if  $\sum_{k=1}^{M} \delta_{i,k} \ge 0.5 \cdot M$

or  $\sum_{k=1}^{M} \delta_{i,k} \ge 0.75 \cdot M$

otherwise score = 0,

where $\delta_{i,k}$ = 1 when occurrence k of activity i satisfies
expression (1) and 0 otherwise.

Similar "score cards" were created using the percent difference and
its corresponding standard deviation in expression (1). Phases were
"scored" by comparing the sum of the arithmetic (and percent)
differences ($\sum d_i$) for all the activities in that phase, of that
project, to the corresponding phase standard deviation computed from
the total population. In this case,

where  $J$ = number of activities in a unique phase in
a project

$p$ = a unique phase

then  score = 1

if  $\sum_{i=1}^{J} d_i < x \cdot SD_p$

otherwise score = 0.

The  above  algorithms  were  repeated  for  each  project. 
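
The two scoring rules can be sketched as follows. The function names are hypothetical, and the comparison follows expression (1) exactly as printed, using the signed difference. Whether the original program compared absolute values instead cannot be determined from this text, so that point is left as an open assumption.

```python
def score_activity(differences, sd, x, required_fraction):
    """Score one unique activity within one project (1 or 0).

    differences: d_{i,k} for the M occurrences of the activity in the project.
    sd: standard deviation SD_i of that activity over the total population.
    x: multiplier of the standard deviation.
    required_fraction: 0.50 or 0.75.
    """
    m = len(differences)
    if m == 0:
        return 0
    hits = sum(1 for d in differences if d < x * sd)   # expression (1), as printed
    return 1 if hits / m >= required_fraction else 0


def score_phase(differences, sd_phase, x):
    """Score one phase: the summed differences against x * SD_p."""
    return 1 if sum(differences) < x * sd_phase else 0
```

With differences of 5, 12, 3 and 20 hours, a standard deviation of 10 and x = 1.0, two of the four occurrences satisfy expression (1). The activity therefore scores 1 on the 50% card and 0 on the 75% card.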


The  SEQUIN  Parameter  Selection  Program 

SEQUIN  uses  two  types  of  variables.  These  are  described  as 
either  "internal"  or  "external."  Internal  variables  are  values 
created by SEQUIN from the input "scores" (0 or 1) provided. These
"internal  scores"  are  essentially  totals  (across  all  projects)  of 
the  input  scores  for  each  unique  activity  within  each  project  "score 
card."  External  variables  are  values  which  are  not  (necessarily) 
functions  of  the  input  "scores"  and  which  are  input  as  separate 
data.  Examples  of  external  variables  used  include: 

1.  Total  hours  expended  on  each  project  (including  expended 
hours  on  activities  within  the  project  which  were  not 
used,  i.e.  non-standard). 

2.  Total hours estimated for each project (including
estimated hours for non-standard activities noted above,
but excluding activities with estimated hours and zero
expended hours).

3.  The absolute arithmetic difference between 1 and 2 above.

4.  The percent difference between 1 and 2 above, expressed
as an integer.

5.  The inverse of the arithmetic difference between
the total hours estimated and expended for each project
expressed as an integer (all activities were included
without exception).

The objective of SEQUIN is to identify parameters which, in
general, optimize some function (called the objective function) of
validity (correlation), internal consistency reliability, and perhaps
other variables. This research is seeking to isolate, through
SEQUIN, parameters (activities) which are critical indicators of
whether a project has been estimated well. That is, if the estimates
for the pool of best predictor activities, selected by SEQUIN, are
accurate there should be some confidence that the overall set of
activities is estimated well. SEQUIN also attempts to identify
a correlation, if one exists, between internal and external variables.
For example, SEQUIN has been used successfully to reduce the number
of questions on a test, by identifying those questions which most
consistently contributed to discovering what the test was attempting
to learn about an individual's knowledge in some area. By using
the appropriate SAT scores as external variables SEQUIN was able to
find a correlation between the test score, using the reduced set
of questions, and the results from the SAT.

The problem of isolating the consistently most influential
activities can be attacked by a variety of optimization techniques.
One practical way of proceeding, and the method used by SEQUIN, is
to use a strategy similar to, but arithmetically less complicated
than, the accretion procedure of statistical multiple regression
analysis. This procedure is sub-optimal in the sense that it does
not necessarily produce optimal solutions. Experience has shown,
however, that SEQUIN determines solutions which are significant
improvements over those derived by human examination of basic data
[118].

Having pointed out that SEQUIN utilizes a searching procedure
that is not optimal, but can usually be expected to be practically
useful, the specific method is now characterized. SEQUIN first
constructs from the parameter responses (activity "scores") based
on the criterion used (whether arithmetic or percent differences and
the percent of the SD), a basic semi-matrix of the form shown in
Figure 4. Then for each project "score card" separately, the program
selects parameters (activities) in such a manner as to increase the
validity (correlation) of the accumulated information result (internal
score) created by the parameter selection process. This selection
"evaluation" is the iterative accretion process which will build the
list of the most predictive activities using the cumulative validity
at each stage of the process. For example, the activity is first
selected which had the highest validity (internal score) with the
original criteria. Next another parameter is selected such that, when
combined with the first selected parameter, the two produce "a two
parameter list of indicative (or predictive) activities" with the
greatest validity. Having found that pair, a new third parameter
is sought which, when combined with the previously selected pair,
produces a three "parameter set" with maximum validity. This process
is repeated until all parameters (activities) are used.
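
The accretion process described above is a greedy forward selection, which can be sketched as below. This is an illustration under stated assumptions and not the SEQUIN program: the function names are hypothetical, "validity" is taken to be the plain product-moment correlation between the running total score and the criterion, and ties go to the first candidate found. The sketch also counts candidate evaluations; for I items the greedy search makes I + (I-1) + ... + 1 = I(I+1)/2 of them.

```python
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation; 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def accretion_order(scores, criterion):
    """Greedy forward selection of items.

    scores[s][i] is the 0/1 score of activity i on project s;
    criterion[s] is the predictor value for project s.
    Returns (order, cumulative_validity, evaluations).
    """
    remaining = list(range(len(scores[0])))
    totals = [0] * len(scores)          # running "evaluation" score per project
    order, validity, evaluations = [], [], 0
    while remaining:
        best_item, best_r = None, None
        for i in remaining:
            evaluations += 1
            trial = [t + row[i] for t, row in zip(totals, scores)]
            r = pearson(trial, criterion)
            if best_r is None or r > best_r:
                best_item, best_r = i, r
        totals = [t + row[best_item] for t, row in zip(totals, scores)]
        remaining.remove(best_item)
        order.append(best_item)
        validity.append(best_r)
    return order, validity, evaluations
```

Choosing the item that maximizes the cumulative correlation at each stage is equivalent to maximizing the gain over the previous stage, because the previous stage's correlation is the same for every candidate.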

It can easily be shown that for I parameters SEQUIN will
investigate

(2)  $\sum_{i=0}^{I-1} C_i = I(I+1)/2$

combinations of parameters. Even for I = 30 this requires searching
465 combinations whereas a complete survey requires over a billion
combinations to be examined. The price paid for this strategy is
the insecurity of not knowing the true optimum in terms of predictive
capability of each activity nor how close the results come
to the true optima.

The SEQUIN Objective Function


The objective function used in the SEQUIN program is simply a
function of the difference of the validity of "evaluations" constructed
from k and k-1 parameters (activities). In other words the
objective function is expressed as the difference of product-moment
correlations of the "evaluations" and the predictor variable (i.e.,
internal or external variable).

Define the "scored" response (0 or 1) made in the s-th project
to the i-th activity as X(i,s). Then the "evaluation" score (T, or
"internal score") of the "score card" of the s-th project made up
of responses to k parameters can be expressed as:

(3)  $T(s,k) = \sum_{i=1}^{k} X(i,s)$,  $s = 1(1)S$.

T is a measure of how often activity estimates for a project were
made correctly (i.e., within the relationships established in logical
expression (1) and the percent criterion). Designate the predictor
variable associated with the s-th project as y(s). Also define:

V[k] = variance of the T(s,k),

C[y,k] = covariance of the y(s) and T(s,k), and

V[y] = variance of the y(s).

The objective function, OF[k], can be expressed as the difference
between the validity function for k activities and the validity
function for k-1 activities. Then OF[k] is written as:

(4)  $OF[k] = \frac{1}{\sqrt{V[y]}} \left( \frac{C[y,k]}{\sqrt{V[k]}} - \frac{C[y,k-1]}{\sqrt{V[k-1]}} \right)$.

Since V[y] is invariant with respect to the number of activities in
the "evaluation" it can be ignored in the computation, i.e. V[y] = 1
in formula (4). Thus,


(5)  $OF[k] = \frac{C[y,k]}{\sqrt{V[k]}} - \frac{C[y,k-1]}{\sqrt{V[k-1]}}$

The computational problem is therefore reduced to determining the
numerator and denominator of the terms on the right of equation (5).
It is convenient to express C[y,k] as a function of C[y,k-1], and
V[k] as a function of V[k-1]. This is easily achieved since,

(6)  $C[y,k] = C[y,k-1] + C[y,i_k]$,

where $C[y,i_k]$ is the covariance of the predictor variable y and the
responses (0 or 1) to the parameters (activities) chosen at the k-th
stage in the sequence (i.e., the k-th iteration of the accretion
process using "evaluation scores"). Also

(7)  $V[k] = V[k-1] + V[i_k] + 2 \sum_{j=1}^{k-1} C[i_j, i_k]$

where $V[i_k]$ is the variance associated with the responses (activity
scores) to the parameter (activity) at stage k and

$\sum_{j=1}^{k-1} C[i_j, i_k]$

is the sum of covariance of the responses to the parameter response
chosen at stage k with the responses to each of the other parameters
in the "evaluation" at stage k-1. The functions are easily computed
from information available in sections A and C of the semi-matrix
of Figure 4 (page 84).
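
The recurrences in formulas (6) and (7) can be checked numerically with a short sketch. The function names are illustrative, population covariance is assumed throughout, and the layout of Figure 4 is not reproduced. The point shown is only that the covariance and variance of the running total can be updated from item-level covariances without recomputing the total score.

```python
def cov(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n


def add_item(c_prev, v_prev, chosen, new_item, items, y):
    """Return (C[y,k], V[k]) from (C[y,k-1], V[k-1]).

    items[i] holds the 0/1 responses to activity i over all projects;
    chosen lists the indices already in the evaluation;
    new_item is the index selected at stage k.
    """
    c_new = c_prev + cov(y, items[new_item])                    # formula (6)
    cross = sum(cov(items[j], items[new_item]) for j in chosen)
    v_new = v_prev + cov(items[new_item], items[new_item]) + 2 * cross   # formula (7)
    return c_new, v_new
```

Starting from C = 0 and V = 0 with no items chosen, repeated calls to `add_item` reproduce the covariance and variance obtained directly from the summed scores at every stage.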

The fundamental strategy of SEQUIN is to compute OF[k] for each
parameter response not in the "evaluation" at stage k-1 using
formulas (5), (6) and (7) and find that parameter (activity) for
which OF[k] is a maximum. This parameter then becomes part of the
"evaluation" (i.e., candidate pool of the most predictive activities)
and the next stage is entered. This cycle is repeated until the
parameters available are exhausted. During this process, the
elements of sections A and C of Figure 4 (page 84) need not be transformed
in any way as the computational process proceeds. Thus,
although the procedure for seeking the best solution is similar to
the accretion procedure of multiple regression analysis, the computations
are greatly reduced [118].

SEQUIN Output

The primary output from SEQUIN is a table. A sample computer
output from this research is displayed in Appendix 3. The Sequence
Number specifies the number and order of activities in the "evaluation"
(which make some contribution in predicting overall estimating
accuracy on the projects) created by the parameter selection process;
those activities making no contribution (i.e., with Item Difficulty
of < 0.0005) having already been eliminated by SEQUIN (see Appendix
3, page 173). The Criterion Number specifies the number of the
external or internal criterion for which the table is appropriate.
The Item Number of the selected activity is specified together with
that parameter's Item Difficulty, Point Biserial and Biserial
Correlation with the criterion being analyzed from each iteration
of the process.

The title "CUM CORR" stands for cumulative correlation which is
equivalent to the "validity" of the "evaluation" created by the
parameter selection process. The ordering of activities is a
function of the parameter selection process which generates this
value. This column is frequently the one of greatest interest. The
change from the previous iteration of the selection process is given
in the column marked "CORR CHANGE."

The Internal Consistency reliability of the "evaluation" is
provided. This number is equivalent to that derived from a "Hoyt" or
"Kuder-Richardson Formula 20" analysis. The change in the reliability
coefficient is also given. The "evaluation" Mean, Standard Deviation
and Standard Error of Measurement are provided. Joint Reval is the
cumulative correlation multiplied by the internal consistency. Reval
Change is the internal consistency reliability divided by the validity
change (from current and last stage).
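
For readers unfamiliar with the item statistics named above, the following sketch gives their textbook definitions. These are the standard psychometric formulas and are not taken from the SEQUIN listing. In particular, treating Item Difficulty as the proportion of projects scoring 1 and using population variances are assumptions, and all function names are illustrative.

```python
from math import sqrt


def item_difficulty(item):
    """Proportion of projects scoring 1 on the item (assumed definition)."""
    return sum(item) / len(item)


def point_biserial(item, criterion):
    """Point-biserial correlation of a 0/1 item with a criterion."""
    n = len(item)
    n1 = sum(item)
    n0 = n - n1
    mean_y = sum(criterion) / n
    sd_y = sqrt(sum((v - mean_y) ** 2 for v in criterion) / n)
    if n1 == 0 or n0 == 0 or sd_y == 0:
        return 0.0
    mean_1 = sum(v for v, x in zip(criterion, item) if x == 1) / n1
    mean_0 = sum(v for v, x in zip(criterion, item) if x == 0) / n0
    p = n1 / n
    return (mean_1 - mean_0) / sd_y * sqrt(p * (1 - p))


def kr20(items):
    """Kuder-Richardson Formula 20 for a list of 0/1 items."""
    k = len(items)
    n = len(items[0])
    totals = [sum(item[s] for item in items) for s in range(n)]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    if k < 2 or var_t == 0:
        return 0.0
    pq = sum(item_difficulty(i) * (1 - item_difficulty(i)) for i in items)
    return k / (k - 1) * (1 - pq / var_t)
```

Under these definitions the Joint Reval described in the text would be the cumulative correlation multiplied by the `kr20` value of the items selected so far.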


CHAPTER  VI 


FINDINGS 

Quantitative  Versus  Qualitative  Analysis 

A considerable amount of quantitative information, derived from
this research, is displayed in this chapter. While these classifications
and analyses provide some valuable insights into the software
development process, they should not obfuscate the need for qualitative
interpretations as well.

A guiding principle of this research has been to seek only what
information the limited data base could provide. Analytical observations
should not exceed the known constraints of a data source.
The author's considerable knowledge of, and experience with, the
AFDSDC, the PARMIS system and the data base have restrained any
inclination to draw firm general conclusions regarding software development
throughout the industry. Hopefully this elementary effort
will provoke the beginning of new studies into the improvement of
software development management, planning, estimation and control.

This study intends no criticism of the U.S. Air Force Data
Systems Design Center management nor of its policies and procedures.
On the contrary, the Design Center's commitment to software development
project control, the continual experimental improvements to the
PARMIS system and the Center's willingness to share the PARMIS data
for research, are indicative of the Air Force's concern with the
software estimation problem. The lack of software development estimating
expertise is not unique to the government.

Quantitative  Display  of  the  Findings 

The summarized data pertaining to the selected population of
activities are contained in Table 4. (Definitions of the activities
are contained in Appendix 2.) Totals for the columns titled "Number
Underestimated" and "Number Overestimated" were simply computed by
subtracting the hours expended from the hours estimated for each
occurrence of the activity. A negative difference, therefore, corresponded
to an underestimate. Skill abbreviations are defined as
follows:

FA - Functional analyst. A specialist in the various functional
career fields (Personnel, Finance, etc.) in the Air Force.
Responsible to define user requirements, for test evaluation,
and for preparation of user documentation.

DA - Data Systems Analyst. A specified Air Force career field
or a senior programmer. Responsible for system analysis,
design, testing and documentation.

PG - Programmer. A specified Air Force career field. Responsible
for detailed program design, coding and checkout.

SP - Support personnel. This classification of consumed hours
is very broad and may represent keypunch, typist, management,
computer operator, and a variety of other types of
support personnel.

Computations of the mean and standard deviation for the arithmetic
difference between the estimate and expenditure represent the
activity's tendency to affect project duration. Since the mean was
computed using the absolute value of the difference, it cannot be
presumed that the duration is always extended. The mean and standard
deviation of the percent difference provide a representation
of the estimating accuracy (or more precisely the inaccuracy)
associated with each activity. Figures 5 through 18 are the comparative
(graphical) representations of the data in Table 4.

Fig. 10.  Mean and standard deviation of percentage difference/phase.

Fig. 18.  Hours/skill/phase.

Information pertinent to the distribution of activities, skills
and hours within each project is summarized in Table 5. A comparison
is provided between the "Estimated/Expended Totals of Selected
Activities" from a project for the population and the "Estimated/
Expended Totals of Actual Projects" from the historical data base.
This comparison provides some indication of the amount of data which
had to be discarded for the various reasons stated in Chapter IV,
Data Source. In deference to the many "rules of thumb" regarding
the distribution of resources in software development, the columns
titled "Rules of Thumb Percent of Totals Expended" were added.

These figures were arrived at by slightly reclassifying activities
in the following way:

Analysis - sum of phases B and C

Programming - sum of activities D010, D020, D030, D040 and D050

Testing - sum of activities D071, D090, D120 and Phase E

Documentation - sum of activity D130 and Phases F and G

Figures 19 through 26 are the comparative (graphical) representations
of the data in Table 5.
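
The reclassification above amounts to a lookup from activity code to one of four groups, which can be sketched as follows. The dictionary and function names are illustrative. Activity codes are assumed to be the phase letter followed by the three-digit activity number, as in D020. Computing the percentages over classified hours only is an assumption, since the text does not say how Phase D activities outside the listed ones were treated.

```python
RULE_OF_THUMB_GROUPS = {
    "Analysis": {"phases": {"B", "C"}, "activities": set()},
    "Programming": {
        "phases": set(),
        "activities": {"D010", "D020", "D030", "D040", "D050"},
    },
    "Testing": {"phases": {"E"}, "activities": {"D071", "D090", "D120"}},
    "Documentation": {"phases": {"F", "G"}, "activities": {"D130"}},
}


def classify(activity_code):
    """Map a code such as 'D020' to its group, or None if unassigned."""
    for group, rule in RULE_OF_THUMB_GROUPS.items():
        if activity_code in rule["activities"] or activity_code[0] in rule["phases"]:
            return group
    return None


def percent_by_group(expended_by_activity):
    """Percent of classified expended hours falling in each group."""
    totals = {g: 0.0 for g in RULE_OF_THUMB_GROUPS}
    for code, hours in expended_by_activity.items():
        group = classify(code)
        if group is not None:
            totals[group] += hours
    grand = sum(totals.values())
    return {g: (100.0 * h / grand if grand else 0.0) for g, h in totals.items()}
```

For a project with 30 hours in C010, 40 in D020, 20 in E010 and 10 in F010, the four groups receive 30, 40, 20 and 10 percent respectively.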

The  number  of  unique  activities  - phases  scored  per  project,  in 
relation  to  the  criteria  specified,  are  depicted  In  Tables  6 and  7. 

Tables 8 and 9 contain the most pertinent data extracted from
the 24 outputs from SEQUIN. (A complete sample output from one of
the SEQUIN runs used by this research is contained in Appendix 3.)
These two tables represent changes in the scoring criterion which
specified:

(a) a unique activity within a project is scored 1 if
at least half (1/2) of those activities (with identical
activity codes) fall within the range of some
variable (x) times the standard deviation (SD).
Refer to algorithm on page 81.

(b) a unique activity within a project is scored 1 if
at least three fourths (3/4) of those activities
(with identical activity codes) fall within the
range of some variable (x) times the standard
deviation (SD). Refer to algorithm on page 81.
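Criteria (a) and (b) differ only in the required fraction, so both can be illustrated with one function. The sketch below is not the algorithm on page 81, which is not reproduced in this chapter. It assumes the range is centered on zero difference by default (the `center` parameter is a hypothetical knob for centering on the mean instead), and every name and value in it is illustrative.

```python
def score_activity(diffs, sd, x, required_fraction, center=0.0):
    """Score one unique activity within a project as 1 or 0.

    diffs: differences (arithmetic or percent) for all occurrences of
           the same activity code within the project.
    sd:    standard deviation of that difference for the activity code.
    x:     multiplier applied to sd to widen or narrow the range.
    required_fraction: 0.5 for criterion (a), 0.75 for criterion (b).
    """
    if not diffs:
        return 0
    within = sum(1 for d in diffs if abs(d - center) <= x * sd)
    return 1 if within / len(diffs) >= required_fraction else 0


# Hypothetical differences for four occurrences of one activity code.
diffs = [5.0, -12.0, 30.0, 8.0]
score_half = score_activity(diffs, sd=10.0, x=1.0, required_fraction=0.5)
score_three_fourths = score_activity(diffs, sd=10.0, x=1.0, required_fraction=0.75)
score_wider = score_activity(diffs, sd=10.0, x=1.5, required_fraction=0.75)
```

Two of the four made-up occurrences lie within one SD, so the activity passes criterion (a) but fails criterion (b) until the multiplier is raised to 1.5.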

The columns on these tables correspond to the SEQUIN output resulting
from changes to the variable multiplier used to adjust the range of
standard deviation. The "Intercriterion Correlation" is displayed,
representing the correlation between the internal variable (Evaluation
Mean) and the external variable (inverse of the arithmetic difference
or total estimated - total expended for the entire project). The
"Evaluation Mean" is the mean of the number of unique activities per
project with a score of one (1).

Fig. 21. Percentage programming (Phase D)/project versus rule of thumb
percentage programming/project.

Fig. 22. Percentage testing (Phase E)/project versus rule of thumb
percentage testing/project.

Fig. 26. Percentage data analyst skill/project.

Table 9. Summarized SEQUIN Results: 3/4 of Unique Activities/Project
Within x • SD (Scored 1)

The items (activities) selected by SEQUIN as being the best
predictors of the internal variable are contained in the upper left hand
corner of the boxes which make up the body of Tables 8 and 9.
The internal variable is itself an indicator of estimating accuracy
based on the criteria used for scoring. These same Item Numbers
appear sequentially on the SEQUIN output and were selected down to a
point where the correlation change became very small. The number
directly below the Item Number is its Item Difficulty. Item
Difficulty is a measure of how often an activity has a score of one
(1) throughout the population of projects. The number in the upper
right corner of the box represents the Correlation Change extracted
from the SEQUIN output. This number represents the amount of
predictive correlation (validity) that the activity contributes toward
predicting the internal variable. The "Cumulative Correlation" is
the sum of the correlation changes within the column. The row
titled "Number of Items" simply represents the number of activities
selected sequentially from the SEQUIN output to obtain the cumulative
correlation noted. The second number is the total number of items
identified by SEQUIN as having any predictive capability. Activities
having an item difficulty less than 0.005 were excluded from
consideration by SEQUIN (see Appendix 3, page 173 for items not
considered in that sample run).
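The quantities read off each box can be illustrated with a short sketch. It does not reproduce SEQUIN's internal selection procedure, which is not described here. It assumes that Item Difficulty is the proportion of projects in which the item scored 1, and it uses a hypothetical cutoff (`min_change`) to stand in for "the point where the correlation change became very small". The item labels and values are invented and are not taken from Tables 8 and 9.

```python
def item_difficulty(scores):
    """Proportion of projects in which an item was scored 1 (assumed form)."""
    return sum(scores) / len(scores)


def select_items(changes, min_change):
    """Take items in SEQUIN's sequential order until the change is small.

    changes: list of (item_label, correlation_change) pairs in output order.
    Returns the selected pairs and their cumulative correlation.
    """
    selected = []
    for item, change in changes:
        if change < min_change:
            break
        selected.append((item, change))
    cumulative = sum(change for _, change in selected)
    return selected, cumulative


# Invented values for illustration only.
difficulty = item_difficulty([1, 0, 1, 1])
changes = [("A", 0.40), ("B", 0.25), ("C", 0.125), ("D", 0.0625), ("E", 0.001)]
selected, cumulative = select_items(changes, min_change=0.01)
number_of_items = (len(selected), len(changes))  # selected, total with any change
```

Here `number_of_items` mirrors the two figures in the "Number of Items" row: the count selected and the total number of items showing any predictive capability.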

Table 10 summarizes from Tables 8 and 9 the number of occurrences
of the most indicative activities selected by SEQUIN. Activities
appearing fewer than five times were not included.

Table 10. Number of Occurrences of Most Predictive Activities

Item     1/2 Within X • SD        3/4 Within X • SD                Occurs
Number   Arithmetic  Percentage   Arithmetic  Percentage   Total   First

  16         6           6            6           6          24      18
  26         5           6            2           5          18       4
  15         3           4            3           6          16       2
  61         4           4            4           3          15
  39         0           6            3           2          11
  30         1           5            1           3          10
   6         1           2            2           4           9
  40         3           1            3           2           9
  11         1           0            2           5           8
  12         1           1            2           4           8
  13         3           3            2           0           8
  17         4           2            2           0           8
  34         3           1            3           1           8
  46         0           4            1           3           8
  64         3           1            1           3           8
   7         3           1            1           2           7
  45         1           2            2           2           7
  52         2           0            3           2           7
  18         1           2            3           0           6
  44         0           3            0           3           6
  53         2           0            3           1           6
  62         3           1            2           0           6
   8         0           1            2           2           5
  19         1           0            2           2           5
  47         0           0            3           2           5
  63         2           0            2           1           5

Qualitative Analysis

The minimum of effort and planning accomplished in the
Feasibility phase (B) is immediately apparent from Tables 4 and 5
(pp. 89 and 108) and Figures 5 and 6 (pp. 96 and 97). The mean
and standard deviation of the differences for this phase (Figures 7
through 10, pp. 98-101) are extraordinarily variant, as might be
expected from the broad definitions of the activities and the nature
of the work being accomplished. Control over this type of work is
rarely very strict since several minor tasks associated with the
generalized activities are usually accomplished by several
organizational entities outside of the development organization. Activity
C010, Define System Concepts, is a generalized task closely related
to this work and, as can be seen (Figure 9, p. 100), also has an
extreme variance in estimating accuracy. Defining the problem
carefully is critical to solving a problem. Therefore, the subsequent
cyclic development, software reliability, testing demands, and
schedule and cost overruns may be directly related to the ill-defined
work, or lack thereof, in phase B [113, 90].

Estimating accuracy (Figures 9 and 10, pp. 100 and 101) can
generally be said to be very bad and inconsistent. However, with a
few exceptions, the activities in the Analysis phase (C) generally
have smaller means and standard deviations for the analytical type
of work being accomplished. Furthermore, this observation is
supported by Figures 11 and 12 (pp. 102 and 103), "Hours underestimated
versus hours overestimated." The number (17) of activities in the
analysis phase contributes to a more precise definition of the work,
making it easier to estimate and control projects.

The literature identifies underestimating as the predominant
problem in estimating accuracy [133, 23, 72]. This study reflects,
contrary to the literature, that overestimates are as common as
underestimates. However, in several instances (e.g., critical
activities C150, E010, E020, E070) the hours underestimated are more
significant than the hours overestimated, while the number of activity
occurrences is equal or in an inverse proportion. This phenomenon
is even more evident in the Testing phase (E) in contrast to all
other phases (Figures 12 and 13, pp. 103 and 104). Clearly,
purposely overestimating can obscure poor estimating where resource
expenditures actually approach the high estimate. This could, of course,
be argued as being accurate estimating. This might be possible,
except for the predominance of overestimates in this data base, and
the wild deviations in estimating accuracy. Gross overestimates are
more likely to be accepted in government agencies as opposed to
commercial activities. There is no competitive bidding for systems
development among government agencies, and as yet, no one to
seriously challenge the cost. Software development in the government
has not yet experienced an Ernest A. Fitzgerald, who achieved national
attention from exposure of cost overruns on the C5-A military
transport aircraft development. Furthermore, manpower authorizations in
Air Force software development agencies are occasionally justified
by the estimated workload figures for proposed systems development
projects. Recall, however, that the Design Center develops standard
systems used at many Air Force installations throughout the world.
This multiplicity of system usage has considerable potential for
amortizing the system cost and for magnifying savings and
efficiencies.

The predominance of programming activities does not seem
unusual (Figures 5, 6, 11, 12, 13, 14, pp. 96, 97, 102-105). As
systems become better defined the work is more easily subdivided
into discrete tasks (programs), each having its own set of activities.
The literature implies that the programming phase is the most
understood, disciplined and quantifiable [138]. Therefore, it is
surprising that estimates for the activities of Coding (D020) and Program
Test (D090) are so grossly inaccurate (Figures 9 and 10, pp. 99 and
100), as is much of the program development phase (D). A possible
explanation for this observation is that the data system
specifications for programs at the AFDSDC were never well written, if
written at all, during the early years of that organization's
existence (1969-1974). Rarely did the specifications adhere to the
standards prescribed for specification writing. Perhaps this flaw
in professional software development can explain the estimating
variance in these activities. Most writers, including Dijkstra,
emphasize the critical importance of detailed specifications for
predicting and controlling program development [106, 27, 97].
Requirement changes have been identified as the most common problem
in the development cycle and the one with the most influence on schedule
and cost overruns [17, 85]. The Design Center is no exception to the
problem of changing requirements, which may be another factor affecting
estimating accuracy during program production. Accurate estimating
is certainly much more difficult when the object to be estimated is
moving so often, and in unanticipated directions.

There appears to be a reasonable distribution of skills across
activities and phases (Figures 15 through 18, pp. 106 and 107). Only
two exceptions are apparent:

1. the comparatively large amount of programmer hours
expended (more than twice as many as any other
skill; Figures 17 and 18, pages 106 and 107) [122],
and

2. the almost insignificant amount of support hours
portrayed in Figure 18 (p. 107).

Robert Barry suggests that valuable resources are wasted on the
wrong part of the problem solution, i.e., coding [12]. Apparently
the programmer skill is being excessively expended to compensate
for inadequate analysis, incomplete specifications, lack of change
control and inexperienced software development management [74].
The question posed earlier in this research suggested that if
traditionally only 15% of the development time is spent on coding, then
the productivity of programmers is not an issue deserving so much
research. Nevertheless, a continuing and more detailed
investigation of this disproportionate expenditure of programmer hours
should be conducted. That is not to say that it is programmer
productivity per se which will resolve cost and estimating problems.

The small amount of support hours displayed does not necessarily
refute Pietrasanta's contention that support is a significant
resource factor consistently ignored in making estimates [115].
First of all, the Implementation phase was eliminated from this
research and a considerable amount of support effort is normally
expended during that phase. The most important reason for small
support figures is that the PARMIS system has never had an adequate
procedure for collecting support hours expended, nor for
automatically adding overhead according to some officially prescribed
algorithm. The support hours portrayed are predominantly for the
typing and keypunch efforts of clerks.

An unusually apparent estimating problem is related to the gross
overestimating for activities in phase G (Table 4, p. 89) related to
developing and coordinating the user documentation (Figures 7 through
10, pp. 98-100). It is not clear what factors induce this
situation, but obviously these activities require closer attention when
estimates are made. Perhaps the fact that the estimates for these
activities are made early in the development cycle and the work is
generally accomplished at the very end of the cycle contributes to
the poor estimating record. Experience suggests that user
documentation is sometimes done hurriedly and poorly, in less time than
should be expended, so that deadline dates can be met.

The project data in Table 5 (p. 108) span the spectrum of
small to large (new) software development projects (Figure 19, p. 115).
The completeness of planning across the projects (i.e., the presence
or absence of specific development phases) is somewhat inconsistent
(Figures 20 through 23, pp. 116-119). Note that project numbers 22 and
23 have no Programming (phase D) activities but do consist of Testing
activities (Table 5, p. 108). However, these are very small projects
and may have involved only documentation changes in conjunction with
the results of system testing. The lack of this kind of project
definition information in PARMIS is also a handicap to research. More
complete project information (i.e., type of project, actual and estimated
start and completion dates, etc.) would be valuable in any research
to analyze the impact of serious estimating errors on total project
performance. Drawing strong conclusions regarding the projects
would not be appropriate because in several cases, a considerable
amount of data, in terms of activities, was ignored (implementation
phase) or discarded (non-standard activities).

Apparently, from Figures 20 through 23 (pp. 116-119), the
percentage of effort expended by development phase, or by the "rule of
thumb" classification, is extremely inconsistent. But then the
various "rules of thumb" themselves, noted in Chapter III, Literature
Survey, are inconsistent, varying as much as 10%. This disparity
in the distribution of effort is particularly obvious in terms of
the number of activities in the five largest projects (Figures 19
through 23, pp. 115-119; projects 6, 9, 15, 20 and 30).
Interestingly, but not very significantly, the averages for the "rules of
thumb" classifications fall within the range of values found in
the literature:

1. Analysis @ 23%,

2. Programming @ 28%,

3. Testing @ 38%, and

4. Documentation @ 17%.

In the Design Center's classification of phases and activities (as
opposed to the "rule of thumb" classifications) only the percentages
for Programming (@ 46%) and Testing (@ 21%) change, and that is
because individual program testing is accomplished in phase D
(Figures 21 and 22, pp. 118 and 119).

Figure 24 (p. 120) reflects the emphasis of the programmer skill
on these projects. Note, however, that some of the smaller projects
(Figure 19, p. 115: projects 16, 17, 27, 28, 30, 31, 35, 37, 38) have
a disproportionately high percentage of programmer use. Intuitively,
this observation is consistent with how skills on small projects
might be distributed (i.e., the programmer also becomes the analyst)
(Figures 25 and 26, pp. 121 and 122).

A close examination of the individual projects in Tables 6 and 7
(pp. 123 and 125) reveals that the number of activities satisfying
the criteria (scored 1) improves significantly when the difference
(arithmetic or percent) is within one standard deviation. This is
not unexpected since the same data used to calculate the standard
deviation were used in the scoring. Thereafter, up to twice the
standard deviation (2 • SD), the number of activities scored does
not improve readily. This point is more apparent in the Average
Number of Activities plus Phases scored per project (Evaluation
Mean) for each criterion, noted at the end of both tables. Thus,
on the average, between 70% and 83% of the unique activities in a
project are scored 1 under the one half and three fourths criteria,
that is, fall within one standard deviation of the arithmetic and
percentage differences. The above
range of percentages was obtained by dividing the smallest and largest
average results (12.879 and 15.205) from Tables 6 and 7 (pp. 123 and
125) by the average number of unique activities (plus phases)
per project (18.282).
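The division described above can be checked directly. The three figures are the ones quoted in the text; the variable names are chosen only for this illustration.

```python
smallest_average_scored = 12.879   # smallest Evaluation Mean, Tables 6 and 7
largest_average_scored = 15.205    # largest Evaluation Mean, Tables 6 and 7
average_unique_activities = 18.282  # unique activities (plus phases) per project

low_percent = 100.0 * smallest_average_scored / average_unique_activities
high_percent = 100.0 * largest_average_scored / average_unique_activities

# Rounded to whole percentages these give the 70% and 83% quoted above.
print(f"{low_percent:.1f}% to {high_percent:.1f}%")
```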

According to the figures in Tables 8 and 9 (pp. 127 and 128)
there is no indicative correlation between the internal and external
variables. The number of Selected Activities required to attain a
minimum of 97% of the predictive capability associated with specific
activities, in any one run, never exceeds fourteen (14) items and
is frequently less. Selected Activities are generally consistent
across the 24 different tests. This observation is apparent from
examining Tables 8 and 9, and even more obvious from the summarized
data in Table 10 (p. 130). The first four items (16, 26, 15, 61) in
Table 10 can be said to be good indicators of estimating accuracy.
That is, if the difference between the estimated and expended hours
of most (50% to 75%) of the specific activities in a project is
within one standard deviation of the differences for these activities,
then it can be anticipated that 70% to 83% of most of the unique
activities in the project will also be within one standard deviation.
It is equally interesting to note that there is little change in the
most predictive activities regardless of whether the arithmetic or
percentage standard deviations were used for scoring.

The single most predictive activity is Item Number 16,
identified from Table 11 as activity C070, Defining Input Data (Media
and Volume). The complete definition for this activity is contained
in Appendix 2 and should be examined carefully. There is an
intuitive appreciation for the conspicuous emergence of such an activity.
This is the work which has the most to do with how well the
remainder of a software development effort will proceed in terms of
estimating accuracy. The importance of carefully defining the system
input, as well as accurately estimating this effort, is further
substantiated by the frequent occurrence of Item 13 (activity C040,
Identify Data Base Requirements/Data Elements/Codes) as a prominent
indicator. Finally, this research confirms emphatically and
quantitatively what software development practitioners have long known:
that inadequate estimates of the work required for detailed system
definition will usually render inaccurate the remaining manpower cost
and schedule estimates. This contention is supported by the
prominence of Items 16, 13, 11, 12 and 13 in Table 10 (p. 130).

The emergence of Item Number 61 (activity G120) is not unusual
in light of the previous observations (Figures 7 and 11, pp. 98 and
102) of estimating accuracy within the User Documentation phase (G).
Gross overestimates, of this activity in particular, create an
extremely large standard deviation within which most of the occurrences
of activity difference would fall. Thus this activity would be
scored one (1) more often than not. Since the activity occurs late
in the development cycle it is not of much value in terms of
identifying estimating problems early enough for management actions to be
greatly effective. Furthermore, the range of the standard deviation
creates some apprehension in using the estimating accuracy on this
activity as an indicator for even the remainder of the latter stages
of development. Nevertheless, the conspicuousness of this activity,
in conjunction with the previous observations, should act as a clear
warning signal to management regarding estimates in this important
part of software development.

The last six (6) Item Numbers (67-72) appearing in the sample
SEQUIN output corresponded to the gross activities used in "small"
projects (K100 through K500 plus Summary phase). No data were
processed for projects using these activities and therefore the Item
Numbers always received a zero (0) score and had no effect on the
final results. The summarized information pertaining to these
activities was retained in Table 4 (p. 89).

Table 11. Item Number - Activity Identification

Item     Phase/Activity
Number   Code             Definition

   1     B100             Research Problem Requirement
   2     B110             Economic Analysis
   3     B200             Prepare Data Automation Request (DAR)
   4     B210             Review/Coordinate/Approve DAR
   5     B300             Prepare Data Project Directive (DPD)
   6     B310             Review/Coordinate/Approve DPD
   7     B400             Prepare Data Project Plan (DPP)
   8     B410             Review/Coordinate/Approve DPP
   9     B000             Summary Phase B
  10     C010             Define System Concepts
  11     C020             Define Interface/Integration Requirements
  12     C030             Review/Coordinate Interface/Integration Requirements
  13     C040             Identify Data Base Requirements/Data Elements/Codes
  14     C050             Define Detailed System Requirements and Objectives
  15     C060             Flow Chart System Processes
  16     C070             Define Input Data (Media and Volume)
  17     C080             Identify Timing Factors (simulation)
  18     C090             Analyze System Optimization
  19     C100             Define Audit Trail Requirements
  20     C110             Coordinate System With User
  21     C120             Document System Specifications
  22     C130             Define Detailed System Processes
  23     C140             Prepare Detailed System Flow Charts
  24     C150             Prepare and Review Program Specifications
  25     C160             Prepare System Test Data
  26     C161             System Design Review
  27     C000             Summary Phase C
  28     D010             Program Analysis and Flow Charting
  29     D020             Code
  30     D030             Keypunch
  31     D040             Compile/Assemble
  32     D050             Desk Check/Debug
  33     D071             Prepare Test Data
  34     D090             Test Program
  35     D120             Test Program in Subsystem
  36     D130             Prepare Program Folder (Vol. IV Documentation)
  37     D000             Summary Phase D
  38     E010             Test System
  39     E020             Debug System
  40     E050             User Review/Coordinate/Approve
  41     E070             Operational Test/User Approval
  42     E080             Prepare System for Release to Operations Directorate
  43     E000             Summary Phase E
  44     F500             Announce Documentation Release in USAF Publication
  45     F520             Prepare Vol. I Documentation - Operations Manual (Run Book)
  46     F530             Type Vol. I Draft
  47     F540             Coordinate Vol. I with User
  48     F550             Prepare Vol. I for Quality Control and Environmental Test
  49     F570             Correct Vol. I
  50     F620             Prepare Vol. II Documentation - Program Documentation Specifications, etc.
  51     F630             Type Draft
  52     F640             Coordinate
  53     F650             Prepare for Quality Control and Environmental Test
  54     F670             Correct
  55     F720             Prepare Documentation - Unique for Batch Type Systems and Computers
  56     F730             Type
  57     F740             Coordinate
  58     F750             Prepare for QC
  59     F770             Correct
  60     F000             Summary Phase F
  61     G120             Prepare Vol. III Documentation - Functional Users Manual
  62     G130             Type Draft
  63     G140             Coordinate
  64     G150             Prepare for QC
  65     G170             Correct
  66     G000             Summary Phase G
  67     K100             Feasibility - Small Projects
  68     K200             Analysis
  69     K300             Programming
  70     K400             Test
  71     K500             Documentation
  72     K000             Summary Phase K

CHAPTER VII

CONCLUSIONS / RECOMMENDATIONS 

The  Engineering  Approach 

Change is the predominant characteristic of the computer industry
[108]. The people associated with the computer science discipline are
generally classified with engineers because the computer itself is the
most predominant symbol of "the industry" and because it is recognized
as an engineering marvel. On the other hand, at the 1969 Conference
sponsored by NATO, entitled "Software Engineering," it was admitted
that:

The  phrase  "Software  Engineering"  was  deliberately  chosen 
as  being  provocative,  in  implying  the  need  for  software 
manufacture  to  be  based  on  the  types  of  theoretical 
foundations  and  practical  disciplines  that  are  traditional 
in  the  established  branches  of  engineering.  [106] 

Hardware and software are constantly achieving remarkable strides
in making the power of the computer more beneficial to mankind and
more easily available. There is, however, a significant departure
between hardware and software in terms of how this progress is
achieved. Hardware progresses in a scientific manner through research
and development and the application of engineering principles.
Software development progresses by brute force and trial and error, and
its development generally proceeds in an undisciplined manner. In
short, the software development process has fallen far behind the
scientific approach used for hardware development and may restrain
the effective use of hardware capabilities in the future [66, 132].

A New  Development  Model 

The foremost conclusion from this research is that estimation of
software development continues to be as bad as, or worse than, it has
ever been. The process used for software development is incompatible
with the classical estimating techniques which managers attempt to
superimpose on this process [97]. That is not to say that there
are not standard activities which must be estimated and accomplished
in some sequential order to successfully develop and implement
software. The structure of this process is faulty when combined with
the estimating objective of trying to predict the cost and span time
for the entire project development [30]. These observations deserve
further elaboration.

Primarily, the process of software development should be
restructured into a new Research - Development - Production (R-D-P)
Model. The R-D-P environment is used by industry and government in
the development of new products, advanced technology (NASA), or a
new weapon system (C5-A aircraft) [27]. New software development is
no less complex than any of these endeavors. In both R&D and
software development, heavy expenditures of resources are required in
order to produce future benefits that are very uncertain in their
magnitude. Because of long software project lives, attempts at
total project planning and cost estimation are very inaccurate. The
research aspect of software development also significantly affects
the accuracy of software project estimates.

Thayer believes that it is impossible to analyze and put into
writing what a man does to manage a Command and Control System. For
example, despite some $65 million spent in writing the software for
the Strategic Air Command's 465L Command and Control System, 95% of
the system had to be rewritten due to poor initial analysis and
hastily conceived assumptions [146]. The results of this
dissertation once again confirmed that the lack of preparation and definition
of work during Feasibility Studies and Requirements Analysis
(preliminary system design) results in the wild variance of estimates
for these efforts. The activities in these two phases should be
classified as Research in lieu of the classical phase structure
which has predominated, and which has consistently failed as a
framework for accurate estimating. The Development of software should
encompass the detailed system design (specifications), program
development, testing and documentation. Production should involve
those activities required in Implementation, Evaluation and
Modification. Maintenance of software is a continuing process and should
remain separately funded and controlled apart from the evolution of
the original system. It is not critical at this point to be overly
concerned about which specific activities belong in which R-D-P
phase. These classifications will change with experience and there
will always be interaction between even these new phases, and some
iteration within the total process [54].

While  it  is  doubtful  that  the  software  development  process  can 
ever  be  delineated  into  clear  cut  phases  with  precise  cut  off  points, 
the  estimation  of  resources  and  time  requirements  for  the  phases  of 
R-D-P  can  be  restricted.  Hopefully,  this  R-D-P  approach  will 
heighten management's awareness of the complexity of what has to be
done  and  eliminate  the  historically  impossible  requirement  to  accur- 
ately estimate  the  cost  and  time  for  the  entire  software  development 
project.  Furthermore,  this  approach  would  restrain  the  rush  to  be- 
gin coding  [13]. 

The  Research  phase  should  be  more  finitely  defined  in  terms  of 
standard  activities.  In  fact,  the  National  Bureau  of  Standards  should 
be  as  concerned  with  standardizing  this  process  as  it  has  been  with 
standardizing computer languages. Adequate planning and report
flexibility can be easily incorporated with the standard activities, as
is done in PARMIS, to satisfy individual managers and existing
commercial project control systems. The Research phase should be
estimated independently, and the estimates for the subsequent
Development and Production should not be attempted until the Research phase
has  been  brought  to  a mutually  satisfactory  conclusion.  Precise 
objectives  and  end  products  should  be  specified  for  most  activities 
and  certainly  for  each  phase.  These  new  classifications  of  effort 
provide  realistic  management  decision  points  and  a basis  for  more 
accurately  estimating  the  next  phase. 


Unfortunately, these recommendations are certainly not revolutionary
ideas. They have been proposed by other, more renowned,
authors since 1963 [88, 13, 106]. It is inconsistent for an
industry so dependent on change to be so resistant to change. This
resistance is particularly confounding in an area of the science
(i.e., software engineering) which has proven time and again that
its primary function (reliable software development) cannot be
controlled in terms of cost and time. Quality research and good
management practices are not incompatible.

Supporting  the  Hypothesis 


The findings of this research support the original hypothesis.
Specific resource consuming activities have been isolated which are
consistently good indicators of whether a software development
project is accurately estimated. These activities were identified
as a result of quantitative analysis. Furthermore, the isolated
activities have an intuitive and logical appeal since they are
predominantly concerned with initial efforts at defining the system
in terms of the data elements and files to be developed. The accuracy
of the original estimates in defining the effort of these complex
tasks obviously has a direct impact on the accuracy of the
subsequent resource estimates for the remaining activities in software
development. Considerable software development experience has been
cited throughout this study confirming that when these types of
analysis/design activities must be reaccomplished, large schedule
and cost overruns result. Software development management is
therefore obliged to be knowledgeable of, and to concentrate its
management attention and efforts on, these important predictor
activities. Original estimates for these indicative activities should be
given careful consideration. Subsequent software development
planning should obviously be updated in light of the results of the
estimating accuracy on these illuminating tasks. Fortunately, these
indicative activities occur early enough in the development cycle
that honest admissions of difficulty still leave sufficient
opportunities for various management actions (i.e., reduced objectives,
etc.) which could save the schedule, the project, the contract and/or
personnel morale.

Chapter  I,  Introduction,  posed  a question  concerning  a correla- 
tion between  specific  activity  estimating  accuracy  and  the  final 
project  estimated  and  expenditure  totals.  SEQUIN,  using  these  data, 
could  find  no  correlation  between  its  Objective  Function  and  the 
external variable used (the inverse of the final arithmetic
difference). Clearly some relationship must exist between accurately
estimating individual activities and the difference between the
project's final total estimate and expenditure. It is conceivable
that either the predominance of overestimates in this data source
obscured  such  a correlation  or  that  an  appropriate  external  variable 
was  not  identified  in  the  data  base. 
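
The elementary variables of this study, the arithmetic and percent
differences between estimated and expended hours, and a simple test
for correlation between an activity's accuracy and the project's
final difference can be sketched as follows. This is only an
illustration and is not SEQUIN or its Objective Function. The sign
convention (estimate minus expenditure, so that a positive value is an
overestimate), the use of the estimate as the percentage base, the
Pearson coefficient, and all of the sample figures are assumptions
made for the example.

```python
from math import sqrt


def arithmetic_difference(estimated, actual):
    """Estimated minus expended hours; positive means an overestimate (assumed convention)."""
    return estimated - actual


def percent_difference(estimated, actual):
    """Arithmetic difference as a percentage of the estimate (assumed base)."""
    if estimated == 0:
        raise ValueError("percent difference is undefined for a zero estimate")
    return 100.0 * (estimated - actual) / estimated


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical projects: (activity estimate, activity expenditure,
#                         project estimate, project expenditure), in hours.
projects = [
    (100, 80, 1000, 900),
    (200, 150, 2000, 1700),
    (50, 60, 800, 900),
    (120, 120, 1500, 1450),
]

activity_pct = [percent_difference(e, a) for e, a, _, _ in projects]
project_pct = [percent_difference(pe, pa) for _, _, pe, pa in projects]
r = pearson(activity_pct, project_pct)
print(f"correlation between activity and project percent difference: {r:.3f}")
```

A coefficient near zero on real data would correspond to the absence
of correlation reported above; a data source dominated by
overestimates would show mostly positive differences on both axes.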

Centralized  Control 

Worthy  of  additional  emphasis  is  the  problem  associated  with 
the  limitations  of  data  sources  dealing  with  estimating  the  soft- 
ware development  process.  Data  are  now  available  in  numerous 
management  information  systems.  The  data  elements  used,  the  develop- 
ment process,  and  definitions  of  the  individual  activities  in  the 
software  development  environment,  pertinent  to  project  planning, 
progress,  and  change,  must  be  standardized.  It  is  recommended  that 
a standardized, PARMIS-like system be centrally established for
controlling  most  software  development  done  by  and  for  the  U.S. 
Government.  A precedent  for  this  type  of  centralized  function  was 
established  with  the  evolution  of  the  Federal  Simulation  Laboratory 
(FEDSIM)  in  Washington,  D.C.  This  organization,  under  operational 
control  of  the  U.S.  Air  Force,  conducts  simulations  of  automated 
computer  systems  for  all  Federal  agencies.  Clearly,  this  approach 
would reduce the costs of the many project control systems, manual
or automated, now operated by the various government agencies
involved in software development. Input could be mailed or
dispatched over government data networks. On-line access could be
provided for immediate status inquiries. Standard management reports
would not be required more than semi-monthly; therefore response
time should not be an insurmountable problem. An important
consideration would be to limit the system to project status reporting
and  accumulation  of  valuable  historical  data  as  opposed  to  man-hour 
accounting.  Mathematical  tools  and  management  analysis  techniques 
could  be  used  to  continually  analyze  and  refine  the  historical  data. 
Significant  planning  factors  might  be  isolated  and  time  and  cost 
estimating  guides  could  be  identified  and  fed  back  to  the  agencies. 

At  least  some  continued  monitoring  of  the  estimating  variability  of 
specific  activities  would  tend  to  narrow  the  difference  between 
estimates  and  expenditures.  A repository  for  sharing  software  devel- 
opment experience,  both  good  and  bad,  could  be  established.  Standard 
software  evaluation  techniques  could  be  developed  for  use  at  various 
development  stages  as  opposed  to  waiting  until  the  system  test  for 
such  evaluations.  The  possibilities  for  all  kinds  of  improvements 
in  research,  software  development,  project  control,  cost  visibility, 
estimating  accuracy,  reliability  and  management  are  numerous. 

Research and development of the software development process itself
would certainly be enhanced [78].

The  rapid  strides  of  professional  progress  come  when  the 
structure  and  principles  that  integrate  individual  ex- 
periences can  be  identified  and  taught  explicitly  rather 
than  by  indirection  and  diffusion.  The  student  can  then 
inherit  an  intellectual  legacy  from  the  past  and  build 
his  own  experience  upward  from  that  level,  rather  than 
having  to  start  over  again  at  the  point  where  his  pre- 
decessors began.  [72] 
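
The continued monitoring of the estimating variability of specific
activities, as recommended above for a central repository, can be
sketched as follows. This is a minimal illustration and not a
description of PARMIS. The record layout, the function name
`variability_by_activity`, the use of the population standard
deviation as the measure of spread, and the sample hours are all
assumptions; only the activity codes are taken from Appendix 2.

```python
from collections import defaultdict
from statistics import mean, pstdev


def variability_by_activity(records):
    """Rank activities by the spread of their percent estimating differences.

    records: iterable of (activity_code, estimated_hours, actual_hours).
    Returns a list of (activity_code, summary) pairs, most variable first.
    """
    pct = defaultdict(list)
    for code, estimated, actual in records:
        if estimated <= 0:
            continue  # no usable estimate; percent difference is undefined
        pct[code].append(100.0 * (estimated - actual) / estimated)

    summary = {
        code: {
            "n": len(values),
            "mean_pct": mean(values),
            "spread_pct": pstdev(values),
        }
        for code, values in pct.items()
    }
    return sorted(summary.items(), key=lambda item: item[1]["spread_pct"], reverse=True)


# Hypothetical history accumulated from several projects.
history = [
    ("C04.0", 100, 50),
    ("C04.0", 100, 150),
    ("D02.0", 200, 190),
    ("D02.0", 200, 210),
]

ranked = variability_by_activity(history)
for code, stats in ranked:
    print(code, stats)
```

Activities at the top of such a ranking are the candidates for closer
management attention and for feedback to the agencies as estimating
guides.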

Recommendations

The  remaining  recommendations  require  no  elaborate  discussion 
and  need  only  be  listed. 

1. Planning is crucial to estimating accuracy. As demonstrated
by Adams in his dissertation [2], a professional, conscientious effort
must be made to consider all resource consuming activities and to
make some estimate of the magnitude of the effort regardless of the
amount of information available.

2. Dynamic planning is essential. Plans and estimates must be
realistically updated as more accurate information becomes available
in  conjunction  with  periodic  project  status  reviews.  Update  planning 
is  a management  function — not  a management  failure!  Pushing  people 
to  attain  unrealistic  goals  destroys  morale,  adversely  influences 
system  reliability  and  encourages  false  progress  reporting.  Product, 
design  and  status  reviews  should  be  planned  activities. 

3.  Improve  reporting  of  productive  work.  Better,  more  encom- 
passing methods  for  capturing  all  resource  expenditures  associated 
with  a specific  software  development  project  are  required. 

4. Measurement of software development progress should not be
limited to a single variable. Several different indicators of
progress should be defined:

(a) dates,
(b) hours,
(c) cost,
(d) skill usage,
(e) support contributions,
(f) numbers of program tests,
(g) approved completed products such as specifications,
programs and chapters,
(h) number of design changes, and
(i) number and magnitude of estimate changes, etc.

Effective  management  visibility  requires  that  the  Research  or 
Development efforts be amenable to review at any time with little or
no  preparatory  cost  [97]. 

5.  New  data  elements  are  required.  Project  control  systems 
need  additional  information  to  enhance  interpretations  of  project 
progress,  to  aid  in  project  control,  and  for  subsequent  research. 
Such  data  should  at  least  include  validated  reasons  for  changes  to 
plans and estimates and short definitions of the project objectives.

6. The data system specifications are the most important
product generated during the software development process. The term
"software development" encompasses the specification of all
procedures to be accomplished by both man and machine. McHenry [97]
has postulated an alternative software development model which is
based on the recognition of programming as a specifying activity.

The  postulated  process  is  a specification  continuum.  The  process 
concludes  when  the  specifications  satisfy  a test  procedure  by  execu- 
tion. The  postulation  has  as  its  justification  the  continuous 
potential  for  managerial  testing,  measuring,  reviewing  and  controlling. 
The  R-D-P  Model  proposed  by  this  dissertation  and  McHenry's  Specifi- 
cation Model  are  totally  compatible  and  complementary. 

7.  The  development  of  user  documentation  has  an  unusually  sig- 
nificant influence  on  software  development  estimating  accuracy. 

This  observation,  combined  with  the  history  of  grossly  overestimating 
this  activity,  warrants  a recommendation  that  more  management  atten- 
tion be  given  to  estimating  this  phase  of  development. 

8.  New  techniques  for  organization  and  technical  software 
development  such  as  Chief  Programmer  Teams  and  structured  programming 
should  continue  to  be  exploited  and  enhanced. 

9.  Activity  and  project  estimates  should  be  classified  according 
to  some  scheme  which  evaluates  the  reliability  (risk,  confidence)  of 
the  estimate. 

10.  Software  estimating  review  groups  should  be  established 
similar  to  those  used  at  TRW  and  described  by  Wolverton  [158]. 

Tracking  the  estimating  accuracy  of  these  groups  as  well  as  deter- 
mining the  reasons  for  poor  estimates,  must,  in  the  long  run,  improve 
estimating  performance. 

11.  The  various  techniques  for  developing  more  reliable  soft- 
ware, and  controls  to  insure  a reasonably  accurate  range  for  the 
time  and  cost  variables,  need  to  be  interrelated  and  combined  into 
a total  organization  control  system  concept.  Software  managers 
cannot, however, rely completely on new system analysis, programming
and testing technologies, organizational innovations and control
systems. The primary factor in the total software development
management equation is, and will always be, people. How best to
manage the people factor in software development is a
most complex issue which even Weinberg [151] has only begun to
address.

12.  Certainly  additional  studies  can  now  be  initiated  using 
similar  techniques,  with  other  data  sources,  to  isolate  the  activities 
pertinent  to  those  sources  which  significantly  influence  software 
development  project  estimates.  Additional  important  correlations  may 
be  discovered  between  specific  activities  and  overall  project  success 
in terms of estimating accuracy. The excessive expenditure of
programmer resources should be examined in greater detail. There is
still much to discover about the software development process, and,
as with all other basic research, that knowledge will not all be
uncovered  by  any  single  study.  At  least  applying  the  knowledge  and 
recommendations  gained  from  this  research  might  well  yield  a marked 
improvement  in  future  software  development  project  estimating  efforts. 
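
Several of the recommendations above concern the data a project
control system should hold: the number and magnitude of estimate
changes (item 4), validated reasons for changes to estimates (item 5),
and a classification of each estimate's reliability (item 9). A
minimal sketch of such a record follows. The class and field names,
the three-level confidence scale, and the sample values are
hypothetical and are not drawn from PARMIS or any other system cited
in this study.

```python
from dataclasses import dataclass, field
from enum import Enum


class Confidence(Enum):
    """Assumed three-level reliability classification for an estimate."""
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class EstimateRevision:
    new_hours: float
    reason: str  # validated reason for the change


@dataclass
class ActivityEstimate:
    activity_code: str
    planned_hours: float
    confidence: Confidence
    revisions: list = field(default_factory=list)

    def revise(self, new_hours, reason):
        """Record an updated estimate; a reason is mandatory."""
        if not reason.strip():
            raise ValueError("a reason is required for every estimate change")
        self.revisions.append(EstimateRevision(new_hours, reason))

    @property
    def current_hours(self):
        """Latest estimate, or the original plan if never revised."""
        return self.revisions[-1].new_hours if self.revisions else self.planned_hours

    def total_change_magnitude(self):
        """Sum of the absolute sizes of all successive estimate changes."""
        total = 0.0
        previous = self.planned_hours
        for revision in self.revisions:
            total += abs(revision.new_hours - previous)
            previous = revision.new_hours
        return total


estimate = ActivityEstimate("C04.0", 120.0, Confidence.LOW)
estimate.revise(160.0, "additional data elements identified")
estimate.revise(150.0, "scope reduced after user review")
print(estimate.current_hours, len(estimate.revisions), estimate.total_change_magnitude())
```

Requiring a reason at the point of revision treats update planning as
a normal management function and preserves the history needed for
subsequent research.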

Summary 

Peter  Drucker  has  succinctly  described  the  problem  of  the 
transition  of  the  data  automation  profession  into  a controlled 
development  cycle. 

To  make  knowledge  work  productive  will  be  the  management 
task  of  this  century,  just  as  to  make  manual  work  pro- 
ductive was  the  great  management  task  of  the  last  century. 

The  gap  between  knowledge  work  that  is  managed  for  pro- 
ductivity and  knowledge  work  that  is  left  unmanaged  is 
probably  a great  deal  wider  than  was  the  tremendous 
difference  between  manual  work  before  and  after  the 
introduction of scientific management. [47]

Nevertheless,  the  transition  must  be  made.  The  sharing  of  ex- 
perimental control  systems  data,  needed  research  to  discover  better 
ways  of  planning  and  estimating  the  software  development  cycle,  and 
the  advent  of  a more  knowledgeable  software  project  manager  are 
some of the possible solutions which must be pursued. This is
dependent, of course, on whether the "software engineer" wants to
control the software development process, or whether the process will
continue to control him.


APPENDIX 1


STANDARD  PHASE  DEFINITIONS 


Bxxx: FEASIBILITY STUDY: Involves the tasks associated with Economic
Analysis, Data Automation Requirement (DAR) Preparation, Data Project
Directive (DPD) Preparation, and Data Project Plan (DPP) Preparation as
prescribed by AFM 300-12.

Cxxx:  SYSTEM  ANALYSIS:  Involves  those  detailed  tasks  associated  with 
System  Analysis  in  the  design  or  modification  of  a major  automated  data 
system (ADS).

Dxxx:  PROGRAMMING:  Involves  those  detailed  tasks  associated  with  the 
creation  of  a computer  program. 

Exxx:  SYSTEM  TEST:  Involves  the  tasks  associated  with  testing  the 
entire  ADS  or  a specific  subsystem  of  an  ADS. 

F5xx: PRELIMINARY DOCUMENTATION--VOLUME I: Involves the writing and
preparation for publication of the preliminary version of an AFM 171
series volume I manual.

F6xx: PRELIMINARY DOCUMENTATION--VOLUME II: Involves the writing and
preparation for publication of the preliminary version of an AFM 171
series volume II manual.

F7xx: DOCUMENTATION--NON-B3500: Involves the writing and preparation for
publication of the preliminary versions of AFM 171 series Volumes I, II,
III, and IV manuals for other than B3500 equipment.

Gxxx: DOCUMENTATION--VOLUME III/FUNCTIONAL USER SUPPORT MANUAL:
Involves the writing and preparation for publication of the preliminary
version of an AFM 171 series volume III/Functional User Support Manual.

Hxxx: SYSTEM RELEASE/ENVIRONMENTAL SYSTEM TEST: This event group
creates a reference point for all program/system releases and will
facilitate the collection of resource expenditures by project for those
functions performed by the Directorate of Systems Control (SC). This
event group also provides the events necessary to identify mandatory
milestone dates, such as entry into Environmental System Test and
scheduled Air Force release dates. (See chapter 2, paragraph 2-11b
for full explanation of procedures and use of this event group.)

Ixxx: IMPLEMENTATION/CONVERSION: Involves those tasks associated with
the implementation or conversion of automated data systems and
computer hardware and software.

X35.2: INDEPENDENT MINOR PROJECT: A project involving events which
will require approximately 0 to 500 man-hours of effort for completion.
(See chapter 2, paragraph 9c for full explanation of independent minor
projects and their usage.)
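
The phase prefixes defined above allow the phase of any standard
activity to be derived from its code, which is convenient when
activity records are summarized by phase. The following sketch shows
one way to do this. The lookup table simply restates the phase
definitions of this appendix; the function name `phase_of` and the
longest-prefix matching rule are assumptions made for illustration.

```python
# Phase prefixes restated from the standard phase definitions.
PHASES = {
    "B": "FEASIBILITY STUDY",
    "C": "SYSTEM ANALYSIS",
    "D": "PROGRAMMING",
    "E": "SYSTEM TEST",
    "F5": "PRELIMINARY DOCUMENTATION--VOLUME I",
    "F6": "PRELIMINARY DOCUMENTATION--VOLUME II",
    "F7": "DOCUMENTATION--NON-B3500",
    "G": "DOCUMENTATION--VOLUME III/FUNCTIONAL USER SUPPORT MANUAL",
    "H": "SYSTEM RELEASE/ENVIRONMENTAL SYSTEM TEST",
    "I": "IMPLEMENTATION/CONVERSION",
    "X35.2": "INDEPENDENT MINOR PROJECT",
}


def phase_of(activity_code):
    """Return the phase name for an activity code such as 'C04.0'.

    Longer prefixes are tried first so that 'F52.0' matches 'F5'
    rather than failing on a bare 'F' (assumed matching rule).
    """
    code = activity_code.strip().upper()
    for prefix in sorted(PHASES, key=len, reverse=True):
        if code.startswith(prefix):
            return PHASES[prefix]
    raise KeyError(f"no standard phase defined for activity code {activity_code!r}")


for sample in ("B10.0", "C04.0", "D02.0", "F52.0", "X35.2"):
    print(sample, "->", phase_of(sample))
```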


APPENDIX 2


STANDARD  ACTIVITY  DEFINITIONS 


B10.0: RESEARCH PROBLEM/REQUIREMENT: This event should be used to
accumulate the time expended in all preliminary research and analysis
associated with the problem/requirement under review. Included in
this  research  is  a general  survey  for  the  purpose  of  determining  the 
operational  and  technical  desirability  of  automating  an  application, 
as  well  as  a complete  description  of  the  existing  system.  The  sur- 
vey should  clarify  the  objectives  and  requirements  of  the  proposed 
automated  data  system  application/modification  and  explore  all  implica- 
tions of  the  problem/requirement.  Also  included  is  the  collection  of 
data  pertaining  to  existing  procedures,  policies,  and  the  preparation 
of  reports  or  position  papers  pertaining  to  these  policies/procedures. 
Sources  of  information  are  current  reports,  files,  work  flow  charts, 
organizational  charts,  and  regulations/manuals. 

B11.0: ACCOMPLISH ECONOMIC ANALYSIS: This event should be used to
accumulate resource expenditures associated with the processes
described in AFM 300-12 which relate to economic analyses for proposals
which apply to an automated data system (ADS). This process is an
iterative  one  in  which  there  is  no  logical  sequence  of  steps,  but 
should  generally  address,  to  the  degree  practicable,  the  identification 
of  the  problem,  description  of  the  relevant  environment,  postulation  of 
objectives,  identification  of  assumptions  and  constraints,  postulation 
of  alternatives,  cost  estimates,  identification  of  benefits,  and  com- 
parison of  alternatives. 

B20.0:  PREPARE  DAR:  Includes  time  expended  in  the  actual  preparation 
of  the  document  which  describes  the  need  requiring  management  attention. 
The  procedures  in  preparing  this  document,  and  the  format,  are  con- 
tained in  AFM  300-12. 

B21.0:  REVIEW/COORDINATE/APPROVE  DAR:  Includes  time  expended  in  the 
review  and  coordination  process  associated  with  each  DAR  prepared  by 
AFDSDC.  The  estimated  completion  date  of  this  event  should  indicate 
the date on which it is anticipated that the DAR will be submitted for
approval. 

B30.0:  PREPARE  DPD:  Includes  time  expended  in  the  preparation  of  the 
DPD  in  accordance  with  AFM  300-12,  attachment  2.  The  estimated  start 
date  for  this  event  should  be  the  anticipated  approval  date  for  the 
DAR, since the DPD must be issued within 30 days after approval of the
DAR.

B31.0: REVIEW/COORDINATE/APPROVE DPD: Includes time expended by the
Office of Primary Responsibility, each project participant, and each
interested staff office in coordinating the proposed DPD. The
estimated completion date for this event should be within 20 span days of
the start date of DPD Preparation, and will indicate the anticipated
date on which the data automation activity (SYD) approves the DPD and
assigns a DPD number.

B40.0: PREPARE DPP: Includes time expended in the actual preparation
of the DPP in the format prescribed by AFM 300-12, attachment 3.

B41.0: REVIEW/COORDINATE/APPROVE DPP: Includes time expended in
review, coordination, and final approval of the DPP.

B50.0: SYSTEM DESIGN REVIEW: Includes time expended to prepare and
conduct the initial Design Review Panel (DRP). During this review, the
responsible Director will brief the DRP as to the purpose, objectives,
and requirements of the system. The panel will review the system
inputs, files, and records to be maintained; provide a concise portrayal
of the developed system logic and processing steps (subsystem flow
charts, edits, etc.); and describe processing requirements in
narrative form. (NOTE: The findings and decisions made during this
review establish the basis for future work; hence, this is considered to
be the most crucial of the DRP meetings held during the development
cycle. Each DRP meeting has a separate and unique objective. See
event C16.1.)

C01.0: DEFINE SYSTEM CONCEPTS: Includes time expended to review
objectives established by management during the Feasibility Study, and
to refine these objectives into the detailed definition of information
requirements needed to complete system analysis.

C02.0: DEFINE INTERFACE/INTEGRATION REQUIREMENTS: Includes time
expended to identify and document requirements for extraction of
information from other systems, contribution of information to other
systems, and extent of interaction with other systems.

C03.0: REVIEW/COORDINATE INTERFACE/INTEGRATION REQUIREMENTS: Includes
time expended during that period, subsequent to the
interface/integration definition, when the definition is subjected to
review by offices of primary and corollary interests. Includes any TDY,
participation in meetings, and preparation and coordination of
correspondence leading to final approval of the definition.

C04.0: IDENTIFY DATA BASE REQUIREMENTS/ELEMENTS/CODES: Includes time
expended to identify data base elements required in the system activity.

C05.0: DEFINE DETAILED SYSTEM REQUIREMENTS AND OBJECTIVES: Includes
time expended to identify and document detailed design and performance
requirements for functional elements of information processing.
Includes formatting of tables, tape layouts, labels, and other
intermediate output media, and descriptions and proposed definitions for
layouts and media requirements.

C06.0: FLOW CHART SYSTEM PROCESSES: Includes time expended to portray
the flow of documents and information through the proposed system.
Involves a review of flow charting accomplished during the Feasibility
Study and the addition of details developed in actual system analysis.

C07.0: DEFINE INPUT DATA (MEDIA AND VOLUME): Includes time expended
to format documents used to capture data for input to the computer.
Associated with this task is the requirement for structuring and coding
data elements; also, structuring and formatting intermediate and master
files, identifying input media, and estimating data volume to be
anticipated.

C08.0: IDENTIFY TIMING FACTORS (SIMULATION) (SYO): Includes time
expended to test system concepts; to identify timing factors, by use of
a mathematical  model;  to  assist  in  the  selection  of  the  most  effective 
and  efficient  method  of  accomplishing  a given  task;  to  validate  or 
disapprove  hypotheses;  or  to  use  one  computer  system  to  mimic  the 
operations  of  another. 

C09.0:  ANALYSE  SYSTEM  OPTIMIZATION:  Includes  time  expended  in  a 
critical  review  of  proposed  inputs,  outputs,  objectives,  and  overall 
system  requirements  to  affirm  that  the  proposed  system  will  most 
effectively  and  efficiently  satisfy  management  goals. 

C10.0: DEFINE AUDIT TRAIL REQUIREMENTS: Includes time expended to
identify  system  elements  requiring  audit  trail  consideration,  to  pro- 
vide features  to  allow  association  of  records  with  original,  to  devel- 
op controls  needed  to  assure  complete  transaction  processing  and 
record  balancing,  and  to  assure  capability  to  recreate  records  damaged, 
destroyed,  or  lost  during  processing.  (Involves  timely  Auditor  General 
Representative's  Office  coordination.) 

C11.0: COORDINATE SYSTEM WITH USER: Includes time expended in TDY,
participation in meetings, preparation and coordination of
correspondence leading to user approval of the general system.

C12.0:  DOCUMENT  SYSTEM  SPECIFICATIONS:  Includes  time  expended  to 
document general system information, specifications, and resource
requirements  that  have  been  reviewed,  refined,  and  coordinated  during 
the  user  review  phase  of  system  analysis. 

C13.0:  DEFINE  DETAIL  SYSTEM  PROCESSES:  Includes  time  expended  to 
identify,  in  proper  format,  the  detailed  requirements  for  system  data 
editing, input and output formats, table and file formats, logical
and  arithmetic  manipulation,  data  elements  and  codes,  and  control 
break  data  elements. 

C14.0:  PREPARE  DETAIL  SYSTEM  FLOW  CHARTS:  Includes  time  expended  to 
graphically  depict  elements  described  under  Event  C13.0.  Involves 
preparation  of  a schematic  diagram  portraying  the  flow  of  data  linking 
computer  runs.  This  will  be  the  working  copy  of  the  general  system 
flow  chart  to  be  placed  in  volume  I of  either  the  Automated  Management 
Supporting  Data  System  (AMSDS)  documentation  or  in  general  system 
documentation  required  by  AFM  171-10. 

C15.0: PREPARE AND REVIEW PROGRAM SPECIFICATIONS: Includes time
expended to identify the detailed requirements for system data editing,
input and output formats, table and file formats, logical and
arithmetic manipulations, and data elements. Also includes time expended
in obtaining a complete understanding between analyst(s) and
programmer(s) involved in the system design effort.


C16.0:  PREPARE  SYSTEM  TEST  DATA:  Includes  time  expended  to  devise 
meaningful  test  situations  and  conditions  which  will  thoroughly  exer- 
cise all  programs  and  logic  in  the  system.  Anticipated  test  results 
must  also  be  incorporated  in  this  plan.  The  plan  will  also  consider 
resource requirements for the test (personnel and equipment) and
provide a schedule for the activities to be performed during the test;
i.e., the tests themselves, time allotted for revisions, validation
of documentation, etc., all time spent in accumulating or preparing
test  data  to  exercise  the  logic,  editing,  and  output  requirements  of 
the  entire  system  of  interacting  programs.  Also,  includes  time 
expended  in  keypunching  this  data.  Requirements  established  will  be 
subject  to  revision  during  programmer  desk  check  event. 

C16.1: SYSTEM DESIGN REVIEW: Includes time expended to prepare and
conduct  the  DRP  which  is  required  prior  to  formal  programming.  During 
this  review,  the  responsible  Director  will  present  the  final  design 
plan  to  the  panel,  addressing  the  system  processing  requirements  and 
explaining  any  differences  between  this  panel's  final  design  and  the 
approved design that emerged from the previous review. The DRP will
insure  that  major  deviations  are  reconciled  prior  to  final  approval 
of  the  system. 

D01.0: PROGRAM ANALYSIS/FLOW CHART: Includes time expended by the
programmer  to  study  the  requirements  of  the  program  as  outlined  in 
the  specifications  and  to  formulate  the  detailed  logic  needed  to 
satisfy  these  requirements. 

D02.0:  CODE:  Includes  time  expended  to  translate  detail  block  diagrams 
or  required  specifications  into  a specific  programming  language. 

D03.0:  KEYPUNCH:  Includes  time  expended  in  assigning  coded  data  to 
keypunch  personnel,  actual  keypunching,  and  all  actions  leading  to 
keypunch  completion  and  verification. 

D04.0: COMPILE/ASSEMBLE: Includes time expended in preparing the
program for compilation or assembly.

D05.0: DESK CHECK/DEBUG: Includes time expended by the programmer,
at his work place, checking his logic, coded instructions, and
keypunching prior to initial program compilation. Also includes time
expended to identify and correct logic, coding, and data errors
detected during compilation.

D07.1:  PREPARE  TEST  DATA:  Includes  time  expended  to  devise  meaning- 
ful test  situations  and  conditions  which  will  thoroughly  exercise  all 
program  subroutines  and  program  logic. 

D09.0:  TEST  PROGRAM:  Includes  time  expended  to  fully  test  program, 
all subroutines, and program logic. Also includes time expended to
correct  errors  detected  during  the  test. 

D12.0: TEST PROGRAM IN SUBSYSTEM: Includes time expended to test the
program in the environment where interaction with other programs of
the system is required. Includes time expended to correct errors
detected during the test.

D13.0: PREPARE PROGRAM FOLDER: Includes time expended in preparing
the program folder (volume IV-B3500).

E01.0: TEST SYSTEM: Includes time expended to prepare, set up, and
observe a system test and to collect test inputs (cards, tapes, and
instructions) and outputs (dumps, reports, printout messages, etc.).

E02.0: DEBUG SYSTEM: Includes time expended, subsequent to initial
system  test,  to  analyze  and  evaluate  test  results;  to  generate  recom- 
mendations for  system  or  computer  program  documentation  or  specifica- 
tion changes;  to  prepare  evaluation  reports;  and  to  correct  errors, 
design  deficiencies,  or  documentation  inaccuracies  revealed  by  the 
test. 

E05.0: USER REVIEW/COORDINATION/APPROVAL: Includes time expended,
subsequent to system debugging, in a review of the overall system with
the user(s). Also includes an examination of originally established
concepts and goals, leading to a mutual agreement between designer and
user that established requirements are satisfied by the system.

E07.0: OPERATIONAL TEST/OPR APPROVAL: Includes time expended to
obtain OPR approval for the system, including operational testing if
required. Also includes the time spent TDY, participating in
operational field tests, participating in meetings, and
preparing/coordinating correspondence leading to OPR acceptance of
the system.

E08.0: PREPARE SYSTEM FOR RELEASE TO SC: Includes time expended to
prepare and coordinate a system change package for submission to the
Directorate of Systems Control (SC). Necessary items include AF Forms
636 and 673, AFDSDC Forms 14 and 31, all test data, test run sheets,
program Select Cards, etc., as required by AFDSDCR 171-9.

F50.0: ANNOUNCE DOCUMENTATION IN USAF PUBLICATIONS BULLETIN: Includes
time required to formally announce supporting documentation prior to
all new system releases. This event will also serve as an initial
reminder when documentation efforts begin. This announcement must be
made early enough that it appears in the USAF Publications Bulletin
for a period of 45 to 65 days prior to distribution of new system
documentation.

F52.0: PREPARE VOLUME I DOCUMENTATION: Includes time expended to
draft general system instructions for formal publication. Includes
system operation management documentation as defined by AFM 171-100,
paragraph 6-4, or section A documentation as defined by AFM 171-10,
paragraph 010203 (general), 050102 (B263), or 060102 (H800/200), as
applicable.

F53.0:  TYPE  VOLUME  I DRAFT:  Includes  time  expended  to  format,  review, 
correct,  and  type  the  working  draft  of  volume  I documentation. 

F54.0: COORDINATE VOLUME I WITH AIR STAFF OPR: Includes time expended
to review working draft of documentation, to meet with functional area
representatives concerning the draft, to prepare correspondence, to
travel, if required, to accomplish the review, and to obtain initial
Air Staff OPR approval.

F55.0: PREPARE VOLUME I FOR QUALITY CONTROL/ENVIRONMENTAL SYSTEM TEST:
Includes time expended in typing final draft copy, final review, making
corrections, and preparing necessary forms for submission to SCCQ.

F57.0: CORRECT VOLUME I: Includes time expended to make corrections
to the formatted draft as a result of Quality Control/Environmental
System Test, and to discuss proposed or required corrective action
with personnel concerned.

F62.0: PREPARE VOLUME II DOCUMENTATION: Includes time expended to
revise and define detailed system specifications and program
documentation as required by AFM 171-100, chapter 6, paragraph 6-5.

F63.0:  TYPE  VOLUME  II  DRAFT:  Includes  time  expended  to  prepare  detailed 
specifications  for  formal  publication.  Includes  information  required  by 
AFM  171-100,  chapter  6,  paragraph  6-5. 

F64.0: COORDINATE VOLUME II WITH AIR STAFF OPR: Includes time expended
to review the working draft of the documentation, to meet with
functional area representatives concerning the draft, to prepare
correspondence, and to travel, if required, to accomplish the review.

F65.0:  PREPARE  VOLUME  II  FOR  QUALITY  CONTROL/ENVIRONMENTAL  SYSTEM  TEST: 
Includes  time  expended  in  typing  final  draft  copy,  final  review,  making 
corrections,  and  preparation  of  necessary  forms  for  submission  to  SCCQ. 

F67.0: CORRECT VOLUME II: Includes span days and hours required to make
any corrections to the formatted draft as a result of Quality
Control/Environmental System Test, and to discuss proposed or required
corrective action with personnel concerned.

F72.0:  PREPARE  DOCUMENTATION:  Includes  time  expended  to  develop  and 
maintain  appropriate  system  and  program  folders  at  the  development 
center.  These  folders  serve  as  a basis  for  the  documentation  required 
by  AFM  171-10.  Includes  time  expended  to  develop  and  maintain  the 
manuals,  to  gather,  organize,  and  proofread  the  material,  and  other 
related  tasks. 

F73.0: TYPE DRAFT: Includes time expended to hand-write, format,
review, correct, and type a working draft. Also includes coordination.

F74.0: COORDINATE WITH AIR STAFF OPR: Includes time expended to
complete a review of the documentation and to obtain OPR approval for
system release. Includes making the corrections/revisions resulting
from the OPR review.

F75.0: PREPARE FOR QUALITY CONTROL/ENVIRONMENTAL SYSTEM TEST: Includes
time expended in typing final draft copy, final review, making
corrections, and preparation of necessary forms for submission to SCCQ.

F77.0: CORRECT DOCUMENTATION: Includes span days and hours required to
make any corrections to the formatted draft as a result of Quality
Control/Environmental System Test, and to discuss proposed or required
corrective action with personnel concerned.

G12.0: PREPARE VOLUME III/FUNCTIONAL USERS' SUPPORT MANUAL DOCUMENTATION:
Includes time expended to revise and define detailed system
specifications and program documentation as required by AFM 171-100,
paragraph 6-6.

G13.0: TYPE VOLUME III/FUNCTIONAL USERS' SUPPORT MANUAL DRAFT:
Includes time expended to prepare information required by AFM 171-100,
paragraph 6-6.

G14.0:  COORDINATE  VOLUME  III/FUNCTIONAL  USERS'  SUPPORT  MANUAL  WITH  AIR 
STAFF  OPR:  Includes  time  expended  to  send  or  hand-carry  the  draft  of 
the  documentation  to  assure  acceptability  of  the  draft  for  formatting 
an  in-depth  content  review,  to  identify  elements  which  are  not  in 
consonance  with  established  policies,  procedures,  or  standards,  and  to 
properly  format  the  material. 

G15.0:  PREPARE  VOLUME  III/FUNCTIONAL  USERS'  SUPPORT  MANUAL  FOR  QUALITY 
CONTROL/ENVIRONMENTAL  SYSTEM  TEST:  Includes  time  expended  in  typing 
final  draft  copy,  final  review,  making  corrections,  and  preparation  of 
necessary  forms  for  submission  to  SCCQ. 

G17.0: CORRECT VOLUME III/FUNCTIONAL USERS' SUPPORT MANUAL: Includes
span days and hours required to make any corrections to the formatted
draft as a result of the Environmental System Test, and to discuss
proposed or required corrective action with personnel concerned.

H01.0: PRODUCT REVIEW: This event will include time expended in the
final product review, required by AFDSDCR 171-11, prior to releasing
system documentation and programs for system release/Environmental
System Test.

H02.0: ENVIRONMENTAL SYSTEM TEST, PHASE I: This is a milestone event
and will not be used to accumulate resource expenditures. It is
established solely to indicate the date the Project Manager expects to
submit the system/program package to SCCQ for Environmental System
Test, Phase I. Phase I will normally be performed on all B3500
system change packages prior to their being released to the field.
Non-B3500 systems use Event H04.0, Environmental System Test, Phase
II. Span days for this event will be 25, to correspond with the
established system release cycle. Subevents H02.1 through H02.5 will be
used by SCCQ to record resource expenditures in conducting Environmental
System Test, Phase I, and they will not appear on the Active Schedule.

H02.1:  SCCR  FUNCTIONS:  This  event  will  include  time  expended  by  SCCR 
in  systems  packages  receipt,  initiating  the  administrative  controls, 
review  for  completeness,  and  all  the  subsequent  administrative  efforts 
involved  in  preparing  the  system  package  for  release. 

H02.2: PROGRAM/SYSTEM REVIEW: This event will include the time
associated with the analysis of programmed input/output, file condition,
processing inter-relationship, and environmental conditions to isolate
problem areas and aid in resolving difficulties. Also included will be
the time expended in developing the test procedures and insuring that
the programs are compatible with documentation.

H02.3: PROGRAM/SYSTEM TEST: This event will be utilized to record the
time expended in testing, in an integrated systems environment, new
and revised computer programs to insure proper interface of program
elements, compatibility with operator procedures, and conformance to
USAF standards.

H02.4: DEBRIEFING TEST RESULTS: This event will be used by Quality
Control personnel to record the time associated with advising the
development agency of the results of the specific program/system
evaluation.

H02.5: DOCUMENTATION REPRODUCTION: This event has been established to
record the man-hours required in preparing the preliminary documentation
to accompany the program/systems release.

H04.0:  ENVIRONMENTAL  SYSTEM  TEST,  PHASE  II:  This  event  will  be  used  to 
record  all  the  time  expended  in  the  evaluation  of  computer  programs, 
documentation,  and  output  products  (including  their  responsiveness  to 
management's  needs)  in  a live  environment  at  one  or  more  selected  Air 
Force  bases  prior  to  formal  world-wide  release. 

H99.9: AIR FORCE RELEASE: This event will be used by the Project
Manager to project his best estimate of when he expects the
program/system to be released Air Force-wide. This is a milestone event
and will not be used for accumulating productive hours. Every effort
must be made to keep this date as accurate as possible, as it will be
used for planning purposes throughout AFDSDC. It may also be used in
periodic releases to the data processing installations to keep them
advised of planned systems releases.

K99.8: FORMAL DOCUMENTATION: This event will include all time expended
by the Directorate of Systems Control in the formal publication of
system documentation, when that effort is associated with small PARMIS
projects established using Event Groups X35.0, X35.1, and X35.2. This
event is required in all projects established in these event groups.
A full explanation of the procedures associated with the use of this
event is contained in chapter 2, paragraph 2-11 of this manual.

K99.9: RELEASE: This event will include all time expended by the
Directorate of Systems Control in conducting Environmental System Test,
when that effort is associated with small PARMIS projects established
using Event Groups X33.0, X35.1, and X35.2. This event is required in
all projects established in these event groups. A full explanation of
the procedures associated with the use of this event is contained in
chapter 2, paragraph 2-11 of this manual.

I01.0: WRITE IMPLEMENTATION/CONVERSION PLAN: Includes time expended
to plan and schedule the total Implementation/Conversion project, to
coordinate the plan with Hq USAF, and to update the plan as experience
is gained during actual Implementation/Conversion.

I02.0: WRITE TRAINING PLAN: Includes time expended to detail the
training requirements of all personnel involved in the
Implementation/Conversion effort.

I03.0: TRAIN IMPLEMENTATION/CONVERSION TEAM: Includes time expended
to train Implementation/Conversion Team representatives from both the
AFDSDC and the major commands.

I04.0: TRAIN BASE PERSONNEL: Includes time expended to train both the
OPR and data automation personnel at each lead base on actions required
during the Implementation/Conversion effort. This event could appear
numerous times to reflect the schedule of lead base training
requirements.

I05.0: SITE PREPARATION: Includes time expended by Design Center
personnel to inspect or assist in the base site preparation when new
equipment installations or large equipment enhancements are required
for system implementation. This event should include time spent on
TDY for site inspection or preparation assistance.

I06.0: ACCEPTANCE TESTING: Includes time expended to run diagnostic
programs and routines to insure that the installed hardware fulfills
all expected requirements.

I07.0: PRECONVERSION/DATA COLLECTION: Includes time expended to
purify data bases, perform dummy edits, and accomplish all necessary
data collection at either the base site or at AFDSDC.

I08.0: CONVERT FILES AT BASE: Includes time expended to assist in the
conversion of files at each lead base. This event could appear
numerous times to reflect the schedule of the file conversion of each
individual lead base.

I09.0: POSTCONVERSION: Includes time expended in follow-up visitations
to the base site to insure that all new procedures are working as
planned.

I10.0: LOAD/IMPLEMENT SYSTEM AT BASE: Includes time expended to
upload the new system/modification at the lead base. This event may
appear numerous times to include time expended at each lead base to
get the system/subsystem operating correctly.

I11.0: POSTINSTALLATION SYSTEM SURVEY: Includes time expended for
any needed final visit to survey operations sometime after the new
system/subsystem has been in full operation.

VITA


Lieutenant Colonel Philip F. Gehring, Jr., U.S. Air Force, was
born on 26 February 1932, in Trenton, New Jersey, the son of Mr. and
Mrs. Philip F. Gehring. He attended the U.S. Naval Academy and
received a B.S. degree in Engineering in 1955. Lt. Col. Gehring has
served 21 years in the U.S. Air Force with experience as a Logistics
Officer, as a Crew Commander in the Strategic Air Command for the
ATLAS E intercontinental ballistic missile, and as a Data Automation
Staff Officer. His M.S. was taken in Information Science at the
Georgia Institute of Technology in 1966. He is also an outstanding
graduate of the Air War College. Lt. Col. Gehring is a past
secretary of the Alpha Chapter of Upsilon Pi Epsilon and is a member of
the Computer Society of IEEE and the Data Processing Management
Association. His permanent address is: 7 Clarence Avenue, West End,
New Jersey 07740.